Here are some datasets to play with:
|ERIC Abstracts||Abstracts written by the ERIC Clearinghouses on Assessment, Early Childhood and Education Management. Excellent classification success.|
|High School essays||Responses to a high school biology item. This is the data used in the JTLA paper. 2 groups.|
|Grade 5 essays||Responses to a prompt written by 5th graders. 3 score groups. BETSY performs poorly on these essays (50% accuracy).|
|Grade 8 essays||Currently being typed.|
The essays are all in ZIP format. Within each archive, each category is in its own subdirectory.
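If you want to inspect a dataset outside of BETSY, the layout described above (one subdirectory per category) can be read with a few lines of Python. This is only a minimal sketch: the function name `load_categories`, the archive name in the usage note, and the assumption that each essay is a plain-text file are all hypothetical and not taken from the datasets themselves.

```python
import zipfile
from pathlib import Path


def load_categories(zip_path, extract_to):
    """Extract a ZIP archive and group file contents by subdirectory name.

    Assumes (hypothetically) that every file in the archive is a plain-text
    document and that its parent directory name is its category label.
    """
    extract_to = Path(extract_to)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(extract_to)

    documents = {}
    for path in sorted(extract_to.rglob("*")):
        if path.is_file():
            # The immediate parent directory serves as the category label.
            label = path.parent.name
            # latin-1 decodes any byte sequence; the real encoding is unknown.
            text = path.read_text(encoding="latin-1")
            documents.setdefault(label, []).append(text)
    return documents
```

For example, `load_categories("grade5.zip", "grade5_data")` (a hypothetical file name) would return a dictionary mapping each category directory name to a list of essay texts. Files stored at the top level of the archive would be labeled with the name of the extraction directory.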
Suggestions are welcome. Again, please keep me posted on your results using BETSY. - Larry Rudner firstname.lastname@example.org