- adding `relco` method to `test_anchor`
- fix error in calculating sig cutoff values for CoCA (Thanks to Bianca Kang!)
- fix testing Suggests for CRAN
- minor change to output labeling for `test_anchor`
- updated documentation for `test_anchor`
- added vignette for `test_anchor`
- Added `test_anchor`
- Added more unit tests
- Fixed a bug in `doc_centrality` using the centroid method
- Fixes for changes to the Matrix package
- Updating documentation and added examples
- Fix encoding issue for non-ASCII characters to work with fastmatch
- Add functionality
  - `perm_tester` for Monte Carlo Permutation Tests for Model P-Values
  - `rancor_builder` creates random corpus based on provided term probabilities
  - `rancors_builder` creates multiple random corpora
- Include additional tests, updated documentation and vignettes
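For readers unfamiliar with the idea behind a Monte Carlo permutation test for a model p-value, here is a minimal base R sketch. It is not the implementation of `perm_tester` and does not use its arguments. The simulated data, the variable names, and the choice of a simple `lm()` model are all assumptions made for illustration.

```r
set.seed(42)  # reproducible illustration

# Simulated data (made up for this sketch)
n <- 100
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)

# Observed coefficient for x
obs <- coef(lm(y ~ x))[["x"]]

# Refit the model many times with the outcome shuffled,
# which breaks any real association between x and y
n_perm <- 500
perm <- replicate(n_perm, coef(lm(sample(y) ~ x))[["x"]])

# Two-sided Monte Carlo p-value; the +1 counts the observed statistic
p_value <- (sum(abs(perm) >= abs(obs)) + 1) / (n_perm + 1)
```

Shuffling the outcome is only one of several possible permutation schemes. Check the package documentation for what `perm_tester` actually permutes.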
- Working on an encoding error in fastmatch which shows inconsistent behavior with non-ASCII characters. This dev version provides a temporary fix.
- Add functionality
  - `doc_centrality` calculates four graph-based centrality metrics using DTMs
  - `doc_similarity` calculates four document similarity measures using DTMs
- Replaced dependency
  - using ClusterR for `get_regions`, instead of mlpack
  - Uses the Armadillo library k-means algorithm only (no longer provides an option)
- Added functionality:
  - `seq_builder` creates a token-integer sequence representation
- Added Shakespeare metadata for examples
- Import Matrix package methods
- Added functionality
  - `dtm_builder` includes an option to return a dense base R matrix
  - `dtm_stopper` includes an option to remove terms based on a term's rank (e.g., top 10); stopping based on count and stopping based on proportion are now two separate options
- Add functions:
  - `find_transformation()` to norm, center, and align matrices
  - `find_projection()` finds the projection matrix onto a vector
  - `find_rejection()` finds the rejection matrix away from a vector
  - `dtm_melter()` quickly turns a DTM into a triplet dataframe (doc_id, term, count)
- Fixed `get_centroid()` naming (limits to single word for names)
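As a rough illustration of the linear algebra behind projection and rejection matrices for a vector, the base R sketch below builds both and checks that they split a vector into parallel and orthogonal parts. The helper names `projection_matrix()` and `rejection_matrix()` are hypothetical. They are not the package's functions, and the package's internals may differ.

```r
# Base R only; these helper names are hypothetical, not text2map's API.
projection_matrix <- function(v) {
  v <- as.numeric(v)
  # Outer product v v^T scaled by the squared norm v^T v
  tcrossprod(v) / sum(v^2)
}

rejection_matrix <- function(v) {
  # Identity minus the projection removes the component along v
  diag(length(v)) - projection_matrix(v)
}

v <- c(1, 2, 2)
x <- c(3, 0, 4)

P <- projection_matrix(v)
R <- rejection_matrix(v)

x_par  <- as.numeric(P %*% x)  # part of x along v
x_perp <- as.numeric(R %*% x)  # part of x orthogonal to v
```

The two parts sum back to `x`, and `x_perp` has a zero dot product with `v`, up to floating-point error.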
- Added functionality to `dtm_stopper()` to stop words by document or term frequencies
  - Nomenclature was changed, `stop_freq` was changed to `stop_termfreq`
- Added functionality to `dtm_resampler()` to resample proportion and fixed N lengths
- Added and clarified documentation
- Added a `NEWS.md` file to track changes to the package.