Document-term matrices for documents sampled from two newsgroups
twoNewsGroups.Rd
A dataset containing two document-term matrices, one for each of two newsgroups (rec.sport.baseball and sci.med) sampled from the 20 Newsgroups dataset.
Format
A list of two matrices, each of dimension 594 by 16214, with one row per document and one column per term. The (i,j) entry of each matrix is the count (term frequency) of the jth term in the ith document. The first matrix in the list contains 594 documents sampled from the rec.sport.baseball newsgroup. The second contains 594 documents sampled from the sci.med newsgroup.
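The layout described above can be illustrated with a minimal sketch. It does not load the dataset itself: `toy` is a hypothetical stand-in with the same list-of-two-matrices structure, and its sizes and counts are invented for illustration. The same indexing should apply to the real object, assuming it is an ordinary list of numeric matrices.

```r
# Toy stand-in with the same layout as the dataset: a list of two count
# matrices with documents in rows and terms in columns. The real matrices
# are 594 x 16214; the sizes and counts here are made up for illustration.
set.seed(1)
n_docs <- 5
n_terms <- 8
toy <- list(
  matrix(rpois(n_docs * n_terms, lambda = 0.5), nrow = n_docs),
  matrix(rpois(n_docs * n_terms, lambda = 0.5), nrow = n_docs)
)

# Each element has one row per document and one column per term.
dims <- sapply(toy, dim)

# Count of the 3rd term in the 2nd document of the first newsgroup.
count_2_3 <- toy[[1]][2, 3]

# Stack both matrices and record which newsgroup each row came from.
x <- rbind(toy[[1]], toy[[2]])
group <- factor(rep(c("rec.sport.baseball", "sci.med"), each = n_docs))

# Total occurrences of each term across all documents.
term_totals <- colSums(x)
```

Stacking the two matrices with a label vector such as `group` is one common way to prepare data of this kind for a two-class comparison; whether that suits a given analysis is left to the reader.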