Text Dataset Detail

DE459A9E-7018-469E-974F-10CAC01F033A Text Corpus

English
Tokens: 50M tokens
Open Source: 100k entries

Dataset Information

License / Open Source Notice: Open Source Training Datasets Terms of Use apply to this dataset.
Content Type: News, Blogs, Conversations
Tokens: 50M tokens
Data Content: Diverse text spanning multiple domains and topics.
File Format: TXT, JSON
Field of Application: NLP
Data Sensitive Items: nil
Copyright Owner: Magic Data

Sample

MDT-SAMPLE Dummy Text Corpus
Open-Source
View: 1234
English
Industry: Financial Services
Application: Document Classification
Type: Training Set
Region: USA

Related Datasets

MDT-TX-002 Mandarin Chinese Text Corpus
Open-Source
View: 1234
English
Industry: Financial Services
Application: Document Classification
Type: Training Set
Region: USA
MDT-TX-003 German Text Corpus
Open-Source
View: 1234
Mandarin Chinese
Industry: Financial Services
Application: Document Classification
Type: Training Set
Region: USA