NEWS

SEQTaRget 1.4.3.9000

Allow arm-specific treatment-weight models: numerator and denominator in SEQopts() now also accept a character vector with one formula per treat.level (in treat.level order), fitting a separate numerator/denominator model - with its own covariates - in each treatment arm. Only the estimates from the respective arm's model are used to construct that arm's weights. A single string keeps the previous behaviour (the same formula in every arm). Arm-specific formulas are only supported for weights estimated on the post-expansion data (weight.preexpansion = FALSE); the LTFU, visit, and outcome models are unaffected.

SEQTaRget 1.4.3 (2026-06-23)

Fix "Risk Differerence" typo in risk.times output
Report follow-up per treatment arm in the @info slot as info$followup.unique and info$followup.nonunique (per subgroup, mirroring info$outcome.unique / info$outcome.nonunique). Both are grouped by baseline treatment over expanded-data rows with an observed (non-NA) outcome - the person-time the outcome model is fit on. The non-unique table counts follow-up intervals (so the non-unique outcome counts divided by these give per-arm event rates); the unique table counts the distinct subjects contributing follow-up to each arm. Also shown in the printed diagnostic tables.
Speed up the hazard ratio calculation by fitting the Cox model with the survival C fitters directly on a prebuilt design matrix instead of coxph(formula, data), avoiding the model.frame/model.matrix rebuild on every bootstrap iteration: survival::coxph.fit() for the non-competing-event model and survival::agreg.fit() for the competing-event Fine-Gray (counting-process) model. The hazard ratio and CIs are unchanged.
Fix the competing-event Fine-Gray hazard fit to use the finegray() case weights (fgwt), which are required for a valid subdistribution-hazard estimate and were previously omitted. This is a no-op for the current hazard simulation (which has only administrative censoring, so all fgwt are 1) but corrects the estimate should the simulated data ever carry random censoring.
Report competing events per treatment arm in the @info slot as info$compevent.unique and info$compevent.nonunique, mirroring the structure of info$outcome.unique / info$outcome.nonunique. Both are grouped by baseline treatment; the non-unique table counts all competing event occurrences in the expanded data and the unique table counts distinct subjects who experienced the competing event. Both are NA when no compevent is specified.
From a SEQuential() fit, populate weight.statistics and outcome.model when hazard = TRUE.
Warn when the numerator and denominator weight models are given identical covariates. In that case the stabilized weights all equal 1 (i.e., no weighting), which is usually a typo in the denominator argument.
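
Why identical numerator and denominator covariates amount to no weighting can be seen in a minimal base-R sketch. The simulated data, the column names (L, A) and the helper p_received() are illustrative assumptions, and plain glm() stands in for the package's fitting backends.

```r
set.seed(42)
n <- 500
dat <- data.frame(L = rnorm(n))
dat$A <- rbinom(n, 1, plogis(0.5 * dat$L))

# Hypothetical helper: fitted probability of the treatment actually received
p_received <- function(fit, data) {
  p1 <- predict(fit, newdata = data, type = "response")
  ifelse(data$A == 1, p1, 1 - p1)
}

den_fit <- glm(A ~ L, family = binomial(), data = dat)

# Numerator with the same covariates as the denominator:
# the ratio of the two predictions is 1 for every row
num_same <- glm(A ~ L, family = binomial(), data = dat)
sw_same <- p_received(num_same, dat) / p_received(den_fit, dat)

# Numerator with fewer covariates: the stabilized weights now vary
num_null <- glm(A ~ 1, family = binomial(), data = dat)
sw_diff <- p_received(num_null, dat) / p_received(den_fit, dat)

range(sw_same)
range(sw_diff)
```
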
Improve the SEQuential() helpfile by adding a per-protocol example.
Add behavioural tests that selection.random = TRUE retains all treated trial-starts, subsamples control trial-starts to the requested selection.prob fraction, and is reproducible under a fixed seed; rename the previous smoke test that did not actually exercise the feature.
Add behavioural tests for weight.lower/weight.upper truncation, weight.p99 truncation, followup.include/trial.include, followup.class, weight.lag_condition, followup.min/followup.max, and weight.eligible_cols.
Fix numerator() and denominator() returning NULL for every weighted model; they now return the fitted per-arm numerator/denominator weight models.
Document that weight truncation applies only to the outcome-model fit.
Pass the formula cache to inline.pred() in the weight models
Fix SEQopts() argument ordering.
Bump codecov/codecov-action to v7
Report the censored/uncensored split in verbose expansion output
Fix params@data divergence so expansion uses the checked and repaired data
Fix off-by-one bootstrap model pairing in internal.hazard()
Fix the serial bootstrap seed in internal.survival() so each replicate standardizes over the same resample its outcome model was fit on
Fix "incorrect number of dimensions" error when a user column shares a name with an internal data.table variable (e.g. an outcome column named out), by hoisting row-index computations out of DT[i] expressions where columns shadow local variables
Fix fastglm.method validation to accept fastglm's full range 0-5 (0, column-pivoted QR, is fastglm's own default and was previously rejected; 4 is the full-pivoted QR and 5 the Bidiagonal Divide and Conquer SVD; all six have been supported since fastglm 0.0.1) and update the documentation accordingly
Fix the default seed so unseeded runs are genuinely random. Previously the full .Random.seed vector was stored and set.seed() silently used only its first element - the RNG kind code, a constant - so every unseeded run drew identical bootstrap resamples. The default is now a single random integer drawn when SEQopts() is called.
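
The failure mode described above can be reproduced in base R. This is a sketch of the mechanism only, not the package code; drawing the replacement seed with sample.int() is an assumption about one reasonable way to obtain a single random integer.

```r
# Two different RNG states
set.seed(1); invisible(runif(5))
state_a <- .Random.seed
set.seed(2); invisible(runif(5))
state_b <- .Random.seed

# The states differ, but their first element (the RNG kind code) is the same
states_differ <- !identical(state_a, state_b)
same_kind_code <- state_a[1] == state_b[1]

# set.seed() coerces its argument to a single integer, so passing the whole
# vector seeds from that first element only
set.seed(state_a); draws_a <- sample.int(100, 5)
set.seed(state_b); draws_b <- sample.int(100, 5)
identical(draws_a, draws_b)

# Assumed alternative: store one random integer as the default seed
default_seed <- sample.int(.Machine$integer.max, 1)
```
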
Fix a spurious "Maximum followup for survival curves" warning when followup.max is set while survival.max is left at its Inf default. The check now runs after the Inf defaults are resolved, which also catches the previously missed case of a finite survival.max exceeding the data-derived followup.max.
Fix show() erroring on a SEQoutput object when both subgroup and compevent are specified (the competing event section passed the model list rather than the subgroup name to cat()); empty competing event model entries are now also skipped instead of printing NULL.
Fix silent merging of bootstrap copies for large numeric subject IDs. The arithmetic relabeling (orig_id * multiplier + copy index) exceeds the 2^53 exact-integer range of doubles for roughly 8+ digit IDs, where consecutive copy indices round to the same value and distinct copies of a subject collapse together under by-ID grouping (corrupting cumulative-product weights). Relabeling now falls back to string concatenation whenever the arithmetic could overflow, or when IDs are negative or non-integer.
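
The loss of precision behind this fix is easy to show in base R. The ID, the multiplier and the helper names below are hypothetical; only the orig_id * multiplier + copy index scheme and the string fallback come from the entry above.

```r
id <- 987654321        # a 9-digit numeric subject ID (illustrative)
multiplier <- 1e8      # assumed multiplier, chosen so the product exceeds 2^53
copies <- 1:4          # four bootstrap copies of the same subject

# Arithmetic relabelling: beyond 2^53 doubles cannot represent every integer,
# so neighbouring copy indices can round to the same value
arith_id <- id * multiplier + copies
length(unique(arith_id))

# String fallback: every copy keeps a distinct label
string_id <- paste0(id, "_", copies)
length(unique(string_id))

# Hypothetical guard for choosing between the two schemes
arithmetic_is_safe <- function(ids, multiplier, max_copy) {
  all(ids >= 0) && all(ids == floor(ids)) &&
    max(ids) * multiplier + max_copy < 2^53
}
arithmetic_is_safe(id, multiplier, max(copies))
```
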
Fix the hazard bootstrap dropping resampling multiplicity. The hazard bootstrap sampler did not relabel bootstrap copies, so the identical copies of any subject drawn more than once collapsed under the simulation's by-(id, trial) grouping, making each replicate behave like a subsample rather than a bootstrap and understating the hazard ratio CI width. Copies are now relabeled uniquely; point estimates are unchanged.
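
A small base-R sketch of why unlabelled copies lose their multiplicity under by-ID grouping. The resample and the "_" label format are illustrative assumptions, not the package's internal scheme.

```r
# An example bootstrap resample of 5 subjects: ids 1 and 5 are drawn twice
draw <- c(1, 1, 3, 5, 5)

# Grouping by the raw id sees only 3 subjects, so the replicate behaves
# like a subsample of the original data
n_groups_raw <- length(unique(draw))

# Number the copies within each id, then build a unique label per copy
copy <- ave(seq_along(draw), draw, FUN = seq_along)
boot_id <- paste0(draw, "_", copy)
n_groups_relabelled <- length(unique(boot_id))

data.frame(draw, copy, boot_id)
```
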
Fix a hardcoded "_sq" suffix in the pre-expansion weight data so a custom indicator.squared no longer fails with "column not found" in the weight models.
Fix method = "dose-response" erroring with a custom indicator.squared. The expansion created dose_sq/trial_sq with hardcoded names (and excluded only the literal dose_sq from expansion variables) while the default covariates referenced paste0("dose", indicator.squared), so any non-default indicator failed in SEQexpand(). All internally generated squared columns (dose, trial, on both the expanded data and the survival prediction grid) now follow indicator.squared, matching the existing convention for followup, trial and time.
Remove dead trialID construction from the survival-curve standardization. The per-row paste0() label was built on the full standardization population (and again on every bootstrap resample) but never read: predictions there are row-wise with no by-ID grouping, so bootstrap multiplicity is carried by the duplicated rows themselves. Results are unchanged.
Vectorise the survival-curve CI clamping with pmax()/pmin() instead of evaluating scalar max()/min() once per row via by = .I. Results are unchanged.
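
The difference between the two approaches, sketched in base R with made-up estimates and standard errors. The per-row loop stands in for the by = .I evaluation, which needs data.table and is not reproduced here.

```r
est <- c(0.02, 0.50, 0.97)   # illustrative survival estimates
se  <- c(0.03, 0.05, 0.04)   # illustrative standard errors
z   <- qnorm(0.975)

# Vectorised clamping to [0, 1]: one call each for the whole column
lci <- pmax(est - z * se, 0)
uci <- pmin(est + z * se, 1)

# Row-by-row equivalent: scalar max()/min() evaluated once per row
lci_loop <- vapply(seq_along(est), function(i) max(est[i] - z * se[i], 0), numeric(1))
uci_loop <- vapply(seq_along(est), function(i) min(est[i] + z * se[i], 1), numeric(1))
```
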
Narrow the per-iteration copy in internal.weights() to the columns the weight models actually use (ids, structure, treatment and its baseline copy, formula covariates, censoring/visit/eligibility indicators, excused flags) instead of copying every column of the expanded (post-expansion weighting) or input (pre-expansion weighting) table on every bootstrap iteration. Results are unchanged.
Drop unmatched rows at the time-varying covariate join in SEQexpand() with nomatch = NULL (and remove a no-op .SDcols there). Original-data rows with no expansion-grid match - possible under followup.min > 0 or selection.random - were carried as NA-trial rows through the squared-column computation only to be discarded by the subsequent inner join with the baseline table. Results are unchanged.
Replace seq.int(1:.N) with seq_len(.N) in the hazard simulation's per-trial follow-up construction, avoiding a double allocation per (id, trial) group; the column is now integer, matching the expansion's convention. Results are unchanged.
Document that with glm.package = "parglm" and bootstrap = TRUE only the main fit uses parglm: the bootstrap refits always use fastglm, warm-started from the main fit's coefficients, which is faster per resample than parglm's per-fit thread setup. This was previously a silent switch.
Clarify unique vs non-unique in diagnostic table labels and docs
Soft deprecate SEQestimate() since it is only accurate to an order of magnitude.
Add bootstrap standard errors to the risk.comparison output: RD SE (standard error of the risk difference, natural scale) and log(RR) SE (standard error of the log risk ratio). Both are reported whenever bootstrap = TRUE, regardless of bootstrap.CI_method (the SEs were already computed internally to form the "se" confidence intervals but were not retained, and not computed at all under "percentile").
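
How such standard errors can be formed from paired per-replicate risks is sketched below in base R. The rbeta() draws are stand-ins for bootstrap replicates and every name is illustrative; this is not the package's internal code.

```r
set.seed(7)
B <- 200
# Stand-ins for the per-replicate risks in each arm (element b of each
# vector is treated as coming from the same bootstrap replicate)
risk_treated <- rbeta(B, 30, 70)
risk_control <- rbeta(B, 20, 80)

# Risk difference on the natural scale, risk ratio on the log scale
rd_se     <- sd(risk_treated - risk_control)
log_rr_se <- sd(log(risk_treated / risk_control))

# A normal-approximation ("se") interval around illustrative point estimates
rd_hat <- 0.30 - 0.20
rr_hat <- 0.30 / 0.20
z <- qnorm(0.975)
rd_ci <- rd_hat + c(-1, 1) * z * rd_se
rr_ci <- exp(log(rr_hat) + c(-1, 1) * z * log_rr_se)
```
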
Fix the confidence-interval column labels in risk.comparison and risk.data, which were hardcoded as 95% regardless of bootstrap.CI. They now reflect the requested level (e.g. RD 90% LCI, 90% UCI when bootstrap.CI = 0.9).

SEQTaRget 1.4.2 (2026-05-21)

Remove mention of units from time in docs.
Improve memory usage in the bootstrapping.
Fix off-by-one labeling in survival output so that followup = k correctly represents survival after k intervals, adding a row at followup = survival.max + 1 for the final interval's estimate.
Fix an expansion bug where subjects experiencing the outcome early were incorrectly carried forward with outcome = 0 rows from subsequent periods; each trial is now truncated at the first event row (thanks, @francescazaccagnino).
Add expand.only option to SEQopts(). When TRUE, SEQuential() returns the expanded data.table directly and skips the analysis steps, for users who want to inspect or store the expanded dataset on its own.
Fix followup.spline = TRUE so the basis is genuinely non-linear. Splines are now built into the model formula via splines::ns() instead of being applied as a single-column transform of followup, and the new followup.spline.df option (default 4) controls the number of basis functions. The treatment-by-followup interaction now uses the same spline basis. Knots are baked from the full expanded followup once at fit time so the basis is identical at fit and prediction time across bootstraps and survival grids. Internally, formula column extraction now uses all.vars(), so user-supplied covariates may include ns(), bs(), I(), factor(), poly() etc. without breaking expansion.
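
The two ideas in this entry, knots fixed once from the full data and all.vars() for column extraction, can be sketched with the splines package that ships with R. The data and the formula are illustrative assumptions, not taken from the package.

```r
library(splines)

followup <- 0:59
basis_full <- ns(followup, df = 4)

# Capture the knots once, from the full follow-up
knots  <- attr(basis_full, "knots")
bknots <- attr(basis_full, "Boundary.knots")

# On a resample, rebuilding with df alone would place the knots differently,
# whereas passing the stored knots reproduces the original basis
set.seed(3)
fu_boot <- sample(followup, 40, replace = TRUE)
basis_fixed <- ns(fu_boot, knots = knots, Boundary.knots = bknots)
basis_pred  <- predict(basis_full, fu_boot)
all.equal(as.vector(basis_fixed), as.vector(basis_pred))

# all.vars() returns the underlying columns, ignoring function wrappers
f <- outcome ~ treatment * ns(followup, df = 4) + I(age^2) + factor(sex)
all.vars(f)
```
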
Rename format.time() to format_time() because it wasn't an S3 method and hence was causing roxygen2 to write incorrect information in its helpfile.
Add package level helpfile and bump roxygen2 to 8.0.0.
Add parglm as an alternative GLM fitting backend.
Add warm starts for bootstrap GLM fits.
Add dataset size summary to verbose output.
Fix selection.random not being propagated from SEQopts() to internal parameters.
Cap data.table to 2 threads during tests and vignette builds, and skip the multisession parallel test on CRAN, to comply with CRAN's 2-core policy for checks.
Apply the SEQopts(nthreads = ...) setting to data.table during SEQuential(). Previously it was only used by the parglm backend and ignored in the default serial fastglm path, so data.table ran at its global default thread count. The previous global setting is restored when the call finishes.
Add risk.times option to SEQopts(). When km.curves = TRUE, risk difference and risk ratio (with CIs) are reported at each requested follow-up time, not just at the end of follow-up. Requested times are snapped to the latest available follow-up at or before them, and the final time is always included. The risk.comparison and risk.data tables gain a Followup column.
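
One way to picture the snapping rule in base R; the follow-up grid and the requested times are made up, and findInterval() is an assumption about how such snapping could be written rather than the package's own code.

```r
available <- c(0, 6, 12, 18, 24)   # follow-up times with an estimate
requested <- c(7, 12, 20)          # illustrative risk.times values

# Latest available time at or before each requested time
snapped <- available[findInterval(requested, available)]

# The final follow-up time is always included
times <- sort(unique(c(snapped, max(available))))
times
```
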
Fix factorize() to also coerce categorical (character) time-varying covariates - and their baseline (_bas) counterparts - to factors with levels fixed from the full data. Previously only fixed and treatment columns were factorized, so a character time-varying covariate could realise different level sets across bootstrap resamples and raise "newdata provided does not match fitted model" (most often in bootstrapped hazard analyses on larger samples or with a smaller bootstrap.sample). Numeric time-varying covariates are left unchanged.
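
Why fixing the levels from the full data matters is shown in a base-R sketch; the region column and the resample indices are illustrative.

```r
full <- data.frame(region = c("north", "south", "east", "west", "north", "south"),
                   stringsAsFactors = FALSE)
full_levels <- sort(unique(full$region))

# A resample that happens to miss "east" and "west"
boot <- full[c(1, 1, 2, 5, 6, 2), , drop = FALSE]

# Levels inferred from the resample alone shrink to what was drawn
f_free <- factor(boot$region)
levels(f_free)

# Levels fixed from the full data stay the same in every resample
f_fixed <- factor(boot$region, levels = full_levels)
levels(f_fixed)
```
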

SEQTaRget 1.4.1 (2026-03-31)

Strip row-level vectors from fastglm objects to reduce weight.statistics memory usage and use a new internal function to print the coefficient table.
Strip row-level vectors from outcome models before storing in @outcome.model
Fix clean_fastglm to strip row-level vectors from nested multinomial weight models
No longer store survival.curve ggplot object; regenerate on demand via km_curve()
Removed several local() wrappers and made several code optimizations.
Improved documentation of the datasets in the package.
Implement check for perfect separation when fitting logistic regression models.
Fixed a bug in, and made some improvements to, internal.weights().
Removed three unused slots in SEQopts().
Add alt text to figures in vignettes.
Fixed the SEQuential() time.col validation so that it detects and repairs non-zero-indexed time.
Add validation for eligible.col values
Add Paul Madley-Dowd as a co-author
Add check for overlapping time_varying.cols and fixed.cols
Add bounds validation for numeric and integer options in SEQopts()
Add check for duplicate id/time combinations in input data
Add check that treat.level values exist in the treatment column
Add validation for excused.cols flags
Add validation for followup.min/max ordering
Add binary check for outcome.col in non-hazard analyses
Add treat.level length validation for multinomial and non-multinomial analyses
Add binary validation for cense.eligible and weight.eligible_cols
Remove additional eligibility rows if not needed
Amend defaults for followup.min and weight.lower from -Inf to 0
Fix bootstrapping for risk difference and risk ratio estimates to use paired per-iteration estimates
Optimizations to use less RAM
Fix duplicate scale_color_manual warning and plot.subtitle label bug in internal.plot()
Run doseresponse and ITT vignette chunks on GitHub Actions
Fix km_curve() returning list instead of ggplot for non-subgroup case
Fix km_curve() subtitle condition
Fix risk.comparison() CIs being NA with competing events
Move selection.random before expansion to reduce peak memory usage
Replace cbind() with := in expansion chain to avoid intermediate copy
Replace merge() with data.table native join in expansion data_list combine step
Replace rbind weight construction with copy+in-place to reduce peak memory
Drop wt and tmp columns immediately after weight is computed in all code paths
Remove redundant setDF calls in fast_model_matrix
Free WDT before bootstrap loop when data.return is FALSE
Use match(TRUE, ...) instead of which(...)[1] to find first switch/event per group
Replace sapply loop with single matrix multiply in multinomial prediction
Vectorise survival curve predictions into a single inline.pred call per treatment level
Free result list after extraction in internal_survival.R to reduce peak memory during bootstrap
Free analytic list after subgroup loop in SEQuential.R to reduce peak memory during survival curve computation
Avoid copy() in data_all construction and free data list in internal_survival.R to reduce peak memory during bootstrap
Filter to followup==0 before adding trialID in internal.survival to avoid copying entire expanded dataset
Trim base_DT to only prediction-needed columns before replication in internal.survival to reduce peak memory
Remove unnecessary copy(weight) for model.data in internal.weights since it is never modified in-place
Free baseDT after bootstrap loop in internal.survival to reduce peak memory during survival curve computation
Fix multinomial.summary: replace vcov() with fastglm $se field and add missing Coefficient column to prevent rbind mismatch
Add test_coverage.R with tests targeting uncovered code paths to increase coverage
Remove some no longer used variables and dead code
Further memory reduction optimizations

SEQTaRget 1.3.6 (2026-02-16)

Added a set.seed() call in internal.hazard() to make the main estimate reproducible, and implemented a fix to ensure that the bootstrapping, for both the standard-error and percentile methods, is deterministic given the seed.

SEQTaRget 1.3.5 (2026-02-05)

The hazard_ratio() function now correctly describes the estimate as "Hazard ratio"
The bootstrapping now collects the log hazard ratio instead of the hazard ratio because the log hazard ratio has better normality properties.
The covariates() function now returns more nicely formatted output (with spaces around ~ and + symbols in the model formulae)

SEQTaRget 1.3.4 (2026-01-23)

Implemented some code optimizations:
- Replace a table() call with data.table's .N
- Remove all gc() calls
- Use a keyed index in bootstrapping
- Remove some uses of copy()

SEQTaRget 1.3.3 (2026-01-08)

Found and fixed a bug which caused excused switches to be overwritten.
Fix excusing override (#115)
Added visit option (#116)