A suitable corpus for training skip-thought vectors

by user1767774   Last Updated May 16, 2018 14:19 PM

To train a variant of skip-thought vectors, I need a large corpus of consecutive (related) sentences. The original skip-thought paper used BookCorpus, but it is no longer available. Is there a similar dataset available online? I know of Project Gutenberg, but its data is notoriously difficult to pre-process, and I would also prefer more contemporary texts.

