updated Readme
parent
c85ce71e24
commit
701d0fdd7e
32
README.md
@@ -1,22 +1,10 @@
# Anne's Bachelor Thesis
# Predictor for Company Mergers
State: October 2018 (in progress)

My python classes for text mining, machine learning models, …

The scripts can be called separately.

Best F1 score results were:

SVM
---
F1 score: 0.8944166649330559
best parameters set found on development set:
{'SVC__C': 0.1, 'SVC__gamma': 0.01, 'SVC__kernel': 'linear', 'perc__percentile': 50}

Naive Bayes
-----------
parameters: SelectPercentile(25), own BOW implementation, 10-fold cross validation
F1 score: min = 0.7586206896551724, max = 0.8846153846153846, average = 0.8324014738144634

The complete documentation can be found in the latex document in the thesis folder.

The csv file 'classification_labelled_corrected.csv' contains 1497 labeled news articles from Reuters.com and is used for the machine learning models.

@@ -26,10 +14,24 @@ Please enter a valid webhose personal key before you call 'Requester.py'.
Also, please change the path to your JAVAHOME environment variable in 'NER.find_companies' method.

example:
# set paths
java_path = "C:\\Program Files (x86)\\Java\\jre1.8.0_181"
os.environ['JAVAHOME'] = java_path

### Best F1 score results:

SVM:

F1 score: 0.8944166649330559

best parameters set found on development set:
{'SVC__C': 0.1, 'SVC__gamma': 0.01, 'SVC__kernel': 'linear', 'perc__percentile': 50}
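
The keys in this dictionary look like scikit-learn's `<step>__<parameter>` naming for pipeline parameters, which would suggest a pipeline with steps named `perc` and `SVC`. That is an assumption; the README does not show the pipeline itself. As a minimal sketch using only the standard library, the keys can be regrouped per step to make the reported setting easier to read (`group_by_step` is a hypothetical helper, not part of the repository):

```python
from collections import defaultdict

# Best parameter set as reported in the README.
best_params = {
    'SVC__C': 0.1,
    'SVC__gamma': 0.01,
    'SVC__kernel': 'linear',
    'perc__percentile': 50,
}


def group_by_step(params):
    """Split '<step>__<parameter>' keys into {step: {parameter: value}}.

    Hypothetical helper for illustration; it only reorganises the keys.
    """
    grouped = defaultdict(dict)
    for key, value in params.items():
        # partition splits at the first double underscore only
        step, _, name = key.partition("__")
        grouped[step][name] = value
    return dict(grouped)


by_step = group_by_step(best_params)
for step, settings in by_step.items():
    print(step, settings)
```

If the `SVC` step is scikit-learn's `SVC`, note that `gamma` is documented as the kernel coefficient for the `rbf`, `poly` and `sigmoid` kernels, so with `kernel='linear'` the reported `gamma` value would probably not influence the result.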

Naive Bayes:

parameters: SelectPercentile(25), own BOW implementation, 10-fold cross validation

F1 score: min = 0.7586206896551724, max = 0.8846153846153846, average = 0.8324014738144634
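
The min, max and average above summarise one F1 score per cross-validation fold. A minimal sketch of that summary follows, assuming each fold yields counts of true positives, false positives and false negatives. The counts below are made up for illustration and are not the thesis data; `f1_score` and `summarize` are hypothetical names.

```python
from statistics import mean


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 if the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def summarize(scores):
    """Return the min, max and arithmetic mean of per-fold scores."""
    return {"min": min(scores), "max": max(scores), "average": mean(scores)}


# Hypothetical (tp, fp, fn) counts for three folds, NOT the thesis results.
fold_counts = [(8, 2, 2), (9, 1, 1), (7, 3, 3)]

fold_scores = [f1_score(tp, fp, fn) for tp, fp, fn in fold_counts]
summary = summarize(fold_scores)
print(summary)
```

With 10-fold cross validation the list would hold ten scores instead of three; the summary step stays the same.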


## Requirements