JGAAP - Software Framework for Evaluation, Testing ... of Authorship
Thread poster: Vito Smolej
Sep 27, 2008

JGAAP is a Java-based, modular program for textual analysis, text categorization, and authorship attribution. Funding for this project has been provided by the National Science Foundation...

... from the wiki page ...

Canonicization

* Smash Case - Converts all text in a document to lower case.
* Normalize Whitespace - Collapses all strings of whitespace characters within a document to a single space.
* Strip HTML - Converts HTML documents into text documents by removing HTML formatting.
* Strip Punctuation - Strips all punctuation characters from a document.
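
To make the four canonicization steps above concrete, here is a minimal Python sketch. It is not JGAAP's code (JGAAP itself is Java); the function names, the ASCII-only punctuation set, and the use of the standard-library `html.parser` are all assumptions made for illustration.

```python
import re
import string
from html.parser import HTMLParser


def smash_case(text: str) -> str:
    """Convert all text to lower case."""
    return text.lower()


def normalize_whitespace(text: str) -> str:
    """Collapse every run of whitespace characters to a single space."""
    return re.sub(r"\s+", " ", text)


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_html(text: str) -> str:
    """Drop HTML tags and keep the text between them."""
    parser = _TextExtractor()
    parser.feed(text)
    parser.close()
    return "".join(parser.parts)


def strip_punctuation(text: str) -> str:
    """Remove punctuation. Assumption: ASCII punctuation only."""
    return text.translate(str.maketrans("", "", string.punctuation))


def canonicize(text: str, steps) -> str:
    """Apply the chosen canonicization steps in the given order."""
    for step in steps:
        text = step(text)
    return text


if __name__ == "__main__":
    raw = "<p>Hello,   <b>World</b>!</p>"
    steps = [strip_html, smash_case, strip_punctuation, normalize_whitespace]
    print(canonicize(raw, steps))
```

The order of the steps matters: stripping punctuation before stripping HTML would damage the tags, so the sketch removes markup first.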

Event Sets

* Characters
* Words
* Word Lengths
* Syllables
* Event n-Grams
...
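
As a rough illustration of what the event sets listed above might look like in practice, the sketch below turns a canonicized text into sequences of events. The function names are hypothetical, words are split on whitespace only, and the syllable count is a crude vowel-group heuristic for English; none of this is taken from JGAAP's source.

```python
import re
from typing import List, Sequence, Tuple


def characters(text: str) -> List[str]:
    """Each character of the text is one event."""
    return list(text)


def words(text: str) -> List[str]:
    """Each whitespace-separated token is one event."""
    return text.split()


def word_lengths(text: str) -> List[int]:
    """Each event is the length of one word."""
    return [len(w) for w in words(text)]


def syllables(text: str) -> List[int]:
    """Each event is a syllable count per word.

    Assumption: a syllable is approximated by a run of vowels,
    with at least one syllable per word.
    """
    return [max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words(text)]


def event_ngrams(events: Sequence, n: int) -> List[Tuple]:
    """Sliding windows of n consecutive events from any event sequence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return [tuple(events[i:i + n]) for i in range(len(events) - n + 1)]


if __name__ == "__main__":
    text = "the quick brown fox"
    print(word_lengths(text))
    print(event_ngrams(words(text), 2))
```

Because `event_ngrams` accepts any sequence, the same helper yields character bigrams, word trigrams, or n-grams of word lengths.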

Statistical Analysis

* Cross-Entropy - Juola-Wyner Sequence Cross-Entropy Test
* Histogram Distance - L2 Metric Histogram Variance Distance
* KS Distance - Nominal Kolmogorov-Smirnov Distance
* LDA - Regularized Linear Discriminant Analysis
* SVM - Support Vector Machine with a Linear Kernel
* Gaussian SVM - Support Vector Machine with a Radial Basis Kernel
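
To give a feel for the simplest method in the list above, here is a small sketch of an L2 histogram distance used for nearest-author attribution. It assumes each document is reduced to relative event frequencies and compared with the Euclidean (L2) metric; whether JGAAP normalizes or weights its histograms the same way is an assumption, and the names below are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Sequence


def relative_frequencies(events: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Histogram of events, scaled so the frequencies sum to 1."""
    counts = Counter(events)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {event: count / total for event, count in counts.items()}


def histogram_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Euclidean (L2) distance between two relative-frequency histograms."""
    pa, pb = relative_frequencies(a), relative_frequencies(b)
    support = set(pa) | set(pb)
    return math.sqrt(sum((pa.get(e, 0.0) - pb.get(e, 0.0)) ** 2 for e in support))


def attribute(unknown: Sequence[Hashable],
              known: Dict[str, Sequence[Hashable]]) -> str:
    """Return the known author whose histogram is closest to the unknown text."""
    return min(known, key=lambda author: histogram_distance(unknown, known[author]))


if __name__ == "__main__":
    samples = {"A": list("aab"), "B": list("bbbc")}
    print(attribute(list("aaab"), samples))
```

The events passed in can come from any event set (characters, words, n-grams), which is what makes this kind of pipeline modular.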


http://www.jgaap.com/

[Edited at 2008-09-27 17:19]