This release is made to be consistent with dplyr
group_by() .drop behaviour.
.drop for dropping empty factor or not in
- operator between yearweek, yearmonth, and yearquarter returns class
as_tsibble.ts() for monthly series starting in months other than January. (#89)
guess_frequency.yearweek() returns 52.18 for a more accurate weekly representation, instead of 52.
n() can now be called in
*_join() for not finding key or index when
by is specified. (#102)
data(tourism) and 2017 data.
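As a quick check on the 52.18 figure mentioned for guess_frequency.yearweek(), the sketch below assumes the value comes from dividing an average year length of 365.25 days by 7. That derivation is an assumption made for illustration; these notes do not state it.

```r
# Assumption: an average calendar year of 365.25 days (allowing for leap years).
days_per_year <- 365.25

# Average number of weeks in a year under that assumption.
weeks_per_year <- days_per_year / 7

# Rounded to two decimals, this gives 52.18 rather than the flat 52.
round(weeks_per_year, 2)
```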
The tsibble data structure and API have reached the stable stage of the lifecycle.
v0.8.0 grouped data frames, tsibble allows for empty key values and disregards the lazily stored key. All operations now recalculate the keying structure.
grouped_ts) is a subclass of
.size is retired in
stretch() in favour of
stretch() gained a new
.fill = NA argument, which returns the same length as the input. To restore the previous behaviour, please use
.fill = NULL. (#88)
rbind() for dropping custom index class. (#78)
count_gaps() for dropping custom index class.
count_gaps() now only summarises keys with gaps instead of all the keys.
Inf. (@jeffzi, #84)
fill_gaps() returns a grouped tsibble too.
scan_gaps() joins the family of implicit missing value handlers.
future_. It requires the furrr package to be installed. (#66)
This release simplifies the “key” structure. The nesting and crossing definition has been removed from the “key” specification. One or more variables forming the “key” are required to identify observational units over time, but no relationship between these variables is assumed any longer. The nesting and crossing structure will be dealt with by visualisation and forecast reconciliation in downstream packages.
count_gaps.tbl_ts() returns a tibble containing gaps for each key value rather than an overall gap, which is consistent with the rest of the tsibble methods. All output column names that are not supplied by users gain a “.” prefix.
interval input instead of time vectors to avoid overheads; also marked as an internal function.
pslider() as new functions
.partial is removed from
pslider() to feature a simpler interface.
build_tsibble(). In order to construct a grouped tsibble,
x requires a grouped df.
has_gaps() to quickly check if there are implicit time gaps for each key in a tsibble.
new_data() to produce the future of a tsibble.
filter_index() to filter a time window of a tsibble.
time_in() to check if time falls within ranges given as compact expressions, with no need for time zone specification.
new_tsibble() creates a subclass of a tsibble.
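The gap helpers mentioned in these notes (has_gaps(), scan_gaps(), count_gaps()) can be compared on a small example. This is a minimal sketch, assuming a recent tsibble release is installed; the toy data and the names toy, id, day, and value are made up for illustration.

```r
library(tsibble)

# Hypothetical toy data: key "a" is missing 2019-01-03, key "b" has no gaps.
toy <- tsibble(
  id = c("a", "a", "a", "b", "b"),
  day = as.Date("2019-01-01") + c(0, 1, 3, 1, 2),
  value = 1:5,
  key = id, index = day
)

# One row per key, with a logical column flagging implicit gaps.
gap_flags <- has_gaps(toy)

# The rows (key and index only) that are implicitly missing.
gap_rows <- scan_gaps(toy)

# A summary of each run of gaps; only keys that have gaps are listed.
gap_runs <- count_gaps(toy)
```

Auto-generated columns such as .gaps and .n carry the “.” prefix described earlier in these notes.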
fill_gaps(), for a more expressive function name and consistency with
POSIXct, time zone will be displayed in the header via
holiday_aus() that requires package “timeDate”.
fill_na.tbl_ts() scoping issue (#67).
slice.tbl_ts() correctly handles logical
fill_na() will only replace implicit time gaps by values and functions, and leave originally explicit
tidyr::fill() gained support for class “grouped_ts”, and it is re-exported again. (#73)
fill_na(), in favour of
find_duplicates(), in favour of
case_na(), and will be defunct in the next release.
split_by(), which is under development as an S3 generic in dplyr.
.drop argument in column-wise verbs, and suggested to use
select() doesn’t select index, it will inform users and automatically select it.
append_row() for easily appending new observations to a tsibble. (#59)
/, consistent with
unnest.lst_ts respects the ordering of “key” values. (#56)
nest.tbl_ts() respects the appearance ordering of input variables. (#57)
key_indices() returns formats consistent with its generic.
key no longer accepts character.
This release introduces breaking changes to the “interval” class so that tsibble better supports finer time resolutions (e.g. millisecond, microsecond, and nanosecond). The “interval” format changes from upper case to shorthand. To support a new time index class, only
pull_interval() need to be defined now.
group_by_key() to easily group the key variables.
slide() gained a new argument
.align = "right" to align at “right”, “center”, or “left”. If the window size is even for center alignment, either “center-right” or “center-left” is needed.
-) for yearweek, yearmonth, and yearquarter.
stretch() gained a new argument
.bind = FALSE.
0 in the “interval” class to make the representation simpler.
interval class has new slots for “millisecond”, “microsecond”, “nanosecond”.
time_unit() is a function instead of an S3 generic, which makes index extension a bit easier.
purrr style exactly (#35):
stretch() return lists only, instead of numerics as before.
pslide() to map over multiple inputs simultaneously.
slide() gained a new argument
.partial to support partial sliding.
pstretcher() support multiple inputs now, and split them in parallel.
holiday_aus() for Australian national and state-based public holidays.
diff() for year-week, year-month, and year-quarter.
yearquarter() supported for character.
pstretch() to slide over multiple inputs simultaneously (#33).
units_since() for index classes.
is_53weeks() to determine if the year has 53 ISO weeks.
key_sum() for extending tsibble.
tsp attribute from the
count_gaps when a tsibble of unknown interval.
as_tsibble.grouped_ts() now return self (#34).
id() is used in the tsibble context (e.g.
build_tsibble()) regardless of the conflicts with dplyr or plyr, to avoid a frustrating message (#36).
select.tbl_ts() now preserved index.
as.ts.tbl_ts() for ignoring the
value argument when the key is empty.
[.tbl_ts() when subsetting columns by characters (#30).
fill_na.tbl_ts() dropping custom index class (#32).
format.yearweek() due to the boundary issue (#27).
The tsibble package has a hexagon logo now! Thanks Mitch (@mitchelloharawild).
index2 will be part of grouping variables.
This release (hopefully) marks the stability of a tsibble data object (tbl_ts). A
tbl_ts contains the following components:
key: single or multiple columns that uniquely identify observational units over time. A key consisting of nested and crossed variables reflects the structure underlying the data. The programme itself takes care of the updates in the “key” when manipulating the data. The “key” differs from the grouping variables with respect to variables manipulated by users.
index: a variable that represents time. Together with the “key”, it uniquely identifies each observation in the data table.
index2: why do we need the second index? It means re-indexing to a variable, not a second index. It is identical to the
index most of the time, but starts deviating when using
index_by() works similarly to
group_by(), but groups the index only. The dplyr verbs, like
mutate(), operate on each time group of the data defined by
index_by(). You may wonder why a new function is introduced rather than using
group_by(), which users are most familiar with. It’s because time is indispensable to a tsibble, and
index_by() provides a trace for understanding how the index changes. For this purpose,
group_by() is just too general. For example,
summarise() aggregates data to a less granular time period, leading to an update of the index, which is now handled nicely and intuitively.
interval class to save a list of time intervals. It computes the greatest common factor from the time differences of the
index column, which should give a sensible interval for almost all cases, compared to the minimal time distance. It also depends on the time representation. For example, if the data is monthly, the index is suggested to use a
yearmonth() format instead of Date, as
Date only gives the number of days, not the number of months.
regular: since a tsibble factors in the implicit missing cases, whether the data is regular or not cannot be determined. This relies on the user’s specification.
ordered: time-wise and rolling window functions assume temporally ordered data. A tsibble will be sorted by its time index. If a key is explicitly declared, the key will be sorted first, followed by arranging time in ascending order. If the data is not in time order, a warning is issued.
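The interval component described above can be illustrated in base R. The following is a minimal sketch of the greatest-common-factor idea, not the package's actual implementation; gcd() and common_interval() are hypothetical helpers written for this example.

```r
# Greatest common divisor of two non-negative whole numbers (Euclid's algorithm).
gcd <- function(a, b) {
  while (b != 0) {
    r <- a %% b
    a <- b
    b <- r
  }
  a
}

# Hypothetical helper: common interval of a whole-number time index,
# taken as the GCD of all consecutive differences.
common_interval <- function(x) {
  gaps <- diff(sort(unique(as.numeric(x))))
  Reduce(gcd, gaps)
}

# Observations at hours 0, 4 and 10: the differences are 4 and 6.
hours <- c(0, 4, 10)
hourly_gcd <- common_interval(hours)   # 2, whereas the minimal distance is 4

# Monthly data stored as Date: the differences are 31 and 28 days.
dates <- as.Date(c("2019-01-01", "2019-02-01", "2019-03-01"))
date_gcd <- common_interval(dates)     # 1 (day), which hides the monthly spacing

# The same data counted in months: the differences are 1 and 1.
months_since_0 <- 2019 * 12 + 1:3
month_gcd <- common_interval(months_since_0)  # 1 (month)
```

The Date case shows why a month-aware index is suggested for monthly data: the day-based differences reduce to an interval of one day.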
tsummarise() and its scoped variants. It can be replaced by the combo
tsummarise() provides an unintuitive interface where the first argument keeps the same size of the index, but the remaining arguments reduce rows to a single one. Analogously, it does
summarise(). The proposed
index_by() solves the issue of updating the index.
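To make the index_by() and summarise() combination concrete, here is a minimal sketch, assuming recent releases of tsibble and dplyr are installed. The data and the names daily, date, value, month, and total are made up for illustration.

```r
library(dplyr)
library(tsibble)

# Hypothetical daily series covering January and February 2019 (59 days).
daily <- tsibble(
  date = as.Date("2019-01-01") + 0:58,
  value = 1:59,
  index = date
)

# Re-index to year-month, then aggregate within each month.
# The result is indexed by `month` rather than `date`.
monthly <- daily %>%
  index_by(month = yearmonth(date)) %>%
  summarise(total = sum(value))
```

The index update from daily to monthly is recorded through index_by(), which a plain group_by() would not convey.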
find_duplicates() to better reflect its functionality.
group_vars() return a vector of characters instead of a list.
distinct.tbl_ts() now returns a tibble instead of an error.
tidyr::fill(), as they respect the input structure.
index_sum(), and replaced by
index_valid() to extend index type support.
fill_na.tbl_ts() gained a new argument of
.full = FALSE.
.full = FALSE (the default) inserts
NA for each key within its time period,
TRUE for the entire time span. This affects the results of
fill_na.tbl_ts() as it only took
TRUE into account previously. (#15)
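The effect of .full can be seen on a small two-key example. This sketch assumes a later tsibble release in which the function is named fill_gaps() (the name these notes mention elsewhere) and keeps the .full argument; the toy data is hypothetical.

```r
library(tsibble)

# Hypothetical data: key "a" spans Jan 1 to Jan 4 with Jan 3 missing,
# key "b" spans Jan 2 to Jan 3 with no gaps.
toy <- tsibble(
  id = c("a", "a", "a", "b", "b"),
  day = as.Date("2019-01-01") + c(0, 1, 3, 1, 2),
  value = 1:5,
  key = id, index = day
)

# Default (.full = FALSE): fill each key only within its own time period.
within_key <- fill_gaps(toy)

# .full = TRUE: fill every key over the entire time span of the data.
full_span <- fill_gaps(toy, .full = TRUE)

nrow(within_key)  # 4 rows for "a" plus 2 rows for "b"
nrow(full_span)   # 4 rows for each of the two keys
```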
.drop in column-wise dplyr verbs.
group_by.tbl_ts() behaves exactly the same as
group_by.tbl_df now. Grouping variables are temporary for data manipulation. Nested or crossed variables are not the type that
tbl_ts gains a new attribute
index2, which is a candidate of new index (symbol) used by
attr(grouped_ts, "vars") stores characters instead of names, same as
This release introduces major changes to the underlying
tbl_ts class to reduce the object size, and computed on the fly when printing.
tbl_ts object is a symbol now instead of a quosure.
tbl_ts object is an unnamed list of symbols.
key_update() to change/update the keys for a given tsibble.
unkey() as an S3 method for a tsibble of key size < 2.
key_indices() as an S3 method to extract key indices.
split_by() to split a tsibble into a list of data by unquoted variables.
build_tsibble() allows users to gain more control over a tsibble construction.
as_tsibble.msts() for multiple seasonality time series defined in the forecast package.
stretch(), are no longer defined as S3 methods. Several new variants have been introduced for the purpose of type stability, like
slide_dfr() (a row-binding data frame),
slide_dfc() (a column-binding data frame).
index variable must sit in the first name-value pair in
tsummarise() instead of any position in the call.
transmute.tbl_ts() keeps the newly created variables along with index and keys, instead of throwing an error as before.
This release marks the complete support of dplyr key verbs.
inform_duplicates() informs which row has duplicated elements of key and index variables.
tsummarise.tbl_ts(), when calling functions with no parameters like
tsummarise.tbl_ts(), one grouping level should be dropped for consistency with
dplyr::summarise() for a grouped
tbl_ts are supported in
as_tsibble(). An empty tsibble is not allowed.
group_by.tbl_ts(.data, ..., add = TRUE) works as expected now.