420-LCW-MS Programming Techniques and Applications - Assignment 3, April 23, 2020

This file is the assignment, and I am completely clueless about how to do it.



Answer To: 420-LCW-MS Programming Techniques and Applications - Assignment 3

Ximi answered on May 04 2021
a3-2-1-zgpisn1p-3nmaq3oe/iris.txt
# Data file from the classic paper:
#
# Fisher, R. A. "The use of multiple measurements in taxonomic problems"
# Annals of Eugenics, 7, Part II, 179-188 (1936)
#
# Fields correspond to the sepal length, sepal width, petal length,
# petal width, and latin name of the three species of iris.
#
# All lengths and widths are in centimeters.
#
# Data file downloaded from the UCI Machine Learning repository, with
# the listed corrections applied.
#
5.1 3.5 1.4 0.2 Iris-setosa
4.9 3.0 1.4 0.2 Iris-setosa
4.7 3.2 1.3 0.2 Iris-setosa
4.6 3.1 1.5 0.2 Iris-setosa
5.0 3.6 1.4 0.2 Iris-setosa
5.4 3.9 1.7 0.4 Iris-setosa
4.6 3.4 1.4 0.3 Iris-setosa
5.0 3.4 1.5 0.2 Iris-setosa
4.4 2.9 1.4 0.2 Iris-setosa
4.9 3.1 1.5 0.1 Iris-setosa
5.4 3.7 1.5 0.2 Iris-setosa
4.8 3.4 1.6 0.2 Iris-setosa
4.8 3.0 1.4 0.1 Iris-setosa
4.3 3.0 1.1 0.1 Iris-setosa
5.8 4.0 1.2 0.2 Iris-setosa
5.7 4.4 1.5 0.4 Iris-setosa
5.4 3.9 1.3 0.4 Iris-setosa
5.1 3.5 1.4 0.3 Iris-setosa
5.7 3.8 1.7 0.3 Iris-setosa
5.1 3.8 1.5 0.3 Iris-setosa
5.4 3.4 1.7 0.2 Iris-setosa
5.1 3.7 1.5 0.4 Iris-setosa
4.6 3.6 1.0 0.2 Iris-setosa
5.1 3.3 1.7 0.5 Iris-setosa
4.8 3.4 1.9 0.2 Iris-setosa
5.0 3.0 1.6 0.2 Iris-setosa
5.0 3.4 1.6 0.4 Iris-setosa
5.2 3.5 1.5 0.2 Iris-setosa
5.2 3.4 1.4 0.2 Iris-setosa
4.7 3.2 1.6 0.2 Iris-setosa
4.8 3.1 1.6 0.2 Iris-setosa
5.4 3.4 1.5 0.4 Iris-setosa
5.2 4.1 1.5 0.1 Iris-setosa
5.5 4.2 1.4 0.2 Iris-setosa
4.9 3.1 1.5 0.2 Iris-setosa
5.0 3.2 1.2 0.2 Iris-setosa
5.5 3.5 1.3 0.2 Iris-setosa
4.9 3.6 1.4 0.1 Iris-setosa
4.4 3.0 1.3 0.2 Iris-setosa
5.1 3.4 1.5 0.2 Iris-setosa
5.0 3.5 1.3 0.3 Iris-setosa
4.5 2.3 1.3 0.3 Iris-setosa
4.4 3.2 1.3 0.2 Iris-setosa
5.0 3.5 1.6 0.6 Iris-setosa
5.1 3.8 1.9 0.4 Iris-setosa
4.8 3.0 1.4 0.3 Iris-setosa
5.1 3.8 1.6 0.2 Iris-setosa
4.6 3.2 1.4 0.2 Iris-setosa
5.3 3.7 1.5 0.2 Iris-setosa
5.0 3.3 1.4 0.2 Iris-setosa
7.0 3.2 4.7 1.4 Iris-versicolor
6.4 3.2 4.5 1.5 Iris-versicolor
6.9 3.1 4.9 1.5 Iris-versicolor
5.5 2.3 4.0 1.3 Iris-versicolor
6.5 2.8 4.6 1.5 Iris-versicolor
5.7 2.8 4.5 1.3 Iris-versicolor
6.3 3.3 4.7 1.6 Iris-versicolor
4.9 2.4 3.3 1.0 Iris-versicolor
6.6 2.9 4.6 1.3 Iris-versicolor
5.2 2.7 3.9 1.4 Iris-versicolor
5.0 2.0 3.5 1.0 Iris-versicolor
5.9 3.0 4.2 1.5 Iris-versicolor
6.0 2.2 4.0 1.0 Iris-versicolor
6.1 2.9 4.7 1.4 Iris-versicolor
5.6 2.9 3.6 1.3 Iris-versicolor
6.7 3.1 4.4 1.4 Iris-versicolor
5.6 3.0 4.5 1.5 Iris-versicolor
5.8 2.7 4.1 1.0 Iris-versicolor
6.2 2.2 4.5 1.5 Iris-versicolor
5.6 2.5 3.9 1.1 Iris-versicolor
5.9 3.2 4.8 1.8 Iris-versicolor
6.1 2.8 4.0 1.3 Iris-versicolor
6.3 2.5 4.9 1.5 Iris-versicolor
6.1 2.8 4.7 1.2 Iris-versicolor
6.4 2.9 4.3 1.3 Iris-versicolor
6.6 3.0 4.4 1.4 Iris-versicolor
6.8 2.8 4.8 1.4 Iris-versicolor
6.7 3.0 5.0 1.7 Iris-versicolor
6.0 2.9 4.5 1.5 Iris-versicolor
5.7 2.6 3.5 1.0 Iris-versicolor
5.5 2.4 3.8 1.1 Iris-versicolor
5.5 2.4 3.7 1.0 Iris-versicolor
5.8 2.7 3.9 1.2 Iris-versicolor
6.0 2.7 5.1 1.6 Iris-versicolor
5.4 3.0 4.5 1.5 Iris-versicolor
6.0 3.4 4.5 1.6 Iris-versicolor
6.7 3.1 4.7 1.5 Iris-versicolor
6.3 2.3 4.4 1.3 Iris-versicolor
5.6 3.0 4.1 1.3 Iris-versicolor
5.5 2.5 4.0 1.3 Iris-versicolor
5.5 2.6 4.4 1.2 Iris-versicolor
6.1 3.0 4.6 1.4 Iris-versicolor
5.8 2.6 4.0 1.2 Iris-versicolor
5.0 2.3 3.3 1.0 Iris-versicolor
5.6 2.7 4.2 1.3 Iris-versicolor
5.7 3.0 4.2 1.2 Iris-versicolor
5.7 2.9 4.2 1.3 Iris-versicolor
6.2 2.9 4.3 1.3 Iris-versicolor
5.1 2.5 3.0 1.1 Iris-versicolor
5.7 2.8 4.1 1.3 Iris-versicolor
6.3 3.3 6.0 2.5 Iris-virginica
5.8 2.7 5.1 1.9 Iris-virginica
7.1 3.0 5.9 2.1 Iris-virginica
6.3 2.9 5.6 1.8 Iris-virginica
6.5 3.0 5.8 2.2 Iris-virginica
7.6 3.0 6.6 2.1 Iris-virginica
4.9 2.5 4.5 1.7 Iris-virginica
7.3 2.9 6.3 1.8 Iris-virginica
6.7 2.5 5.8 1.8 Iris-virginica
7.2 3.6 6.1 2.5 Iris-virginica
6.5 3.2 5.1 2.0 Iris-virginica
6.4 2.7 5.3 1.9 Iris-virginica
6.8 3.0 5.5 2.1 Iris-virginica
5.7 2.5 5.0 2.0 Iris-virginica
5.8 2.8 5.1 2.4 Iris-virginica
6.4 3.2 5.3 2.3 Iris-virginica
6.5 3.0 5.5 1.8 Iris-virginica
7.7 3.8 6.7 2.2 Iris-virginica
7.7 2.6 6.9 2.3 Iris-virginica
6.0 2.2 5.0 1.5 Iris-virginica
6.9 3.2 5.7 2.3 Iris-virginica
5.6 2.8 4.9 2.0 Iris-virginica
7.7 2.8 6.7 2.0 Iris-virginica
6.3 2.7 4.9 1.8 Iris-virginica
6.7 3.3 5.7 2.1 Iris-virginica
7.2 3.2 6.0 1.8 Iris-virginica
6.2 2.8 4.8 1.8 Iris-virginica
6.1 3.0 4.9 1.8 Iris-virginica
6.4 2.8 5.6 2.1 Iris-virginica
7.2 3.0 5.8 1.6 Iris-virginica
7.4 2.8 6.1 1.9 Iris-virginica
7.9 3.8 6.4 2.0 Iris-virginica
6.4 2.8 5.6 2.2 Iris-virginica
6.3 2.8 5.1 1.5 Iris-virginica
6.1 2.6 5.6 1.4 Iris-virginica
7.7 3.0 6.1 2.3 Iris-virginica
6.3 3.4 5.6 2.4 Iris-virginica
6.4 3.1 5.5 1.8 Iris-virginica
6.0 3.0 4.8 1.8 Iris-virginica
6.9 3.1 5.4 2.1 Iris-virginica
6.7 3.1 5.6 2.4 Iris-virginica
6.9 3.1 5.1 2.3 Iris-virginica
5.8 2.7 5.1 1.9 Iris-virginica
6.8 3.2 5.9 2.3 Iris-virginica
6.7 3.3 5.7 2.5 Iris-virginica
6.7 3.0 5.2 2.3 Iris-virginica
6.3 2.5 5.0 1.9 Iris-virginica
6.5 3.0 5.2 2.0 Iris-virginica
6.2 3.4 5.4 2.3 Iris-virginica
5.9 3.0 5.1 1.8 Iris-virginica
a3-2-1-zgpisn1p-3nmaq3oe/a3.pdf
420-LCW-MS Programming Techniques and Applications - Assignment 3
April 23, 2020
As always, remember the general requirements for assignments in this course!
Goals for this assignment:
• Experiment with a machine learning algorithm.
• Learn about different measures of the quality of a classifier.
Introduction
In basic machine learning exercises it is common to examine the percentage of “correct” predictions made by a classifier. This is
easy to compute, in that you can just count the number of times the classifier agrees with the “known” label, and divide this
by the total number of items classified.
In practice, this simple measure isn’t always enough. For example, in a simple two-class problem, one class may outnumber
the other. Suppose our goal is to detect a disease that occurs in only 1% of the population, and our dataset reflects
the distribution of measurements and classifications in the overall population. So only 1% of our data will be labeled
as positive for the disease, and the rest will be labeled negative. In this case, a dumb classifier that just outputs the negative
prediction (no disease) in every case will be right 99% of the time. This kind of imbalance in the training data is one of the
important potential sources of “bias” in our model, in the common sense of the word “bias”.
To get a better idea of the performance of the algorithm in a classification problem with C classes, we want to break down
our errors into C² different “bins”. In this exercise we’ll only worry about a 2-class problem, so we’ll have four bins to fill.
The bins should record the number of times we observe one of the following things:
1. True positives (TP) - the classifier predicted a correct positive label.
2. True negatives (TN) - the classifier predicted a correct negative label.
3. False positives (FP) - the classifier predicted an incorrect positive label.
4. False negatives (FN) - the classifier predicted an incorrect negative label.
It is common to arrange these counts in a C by C matrix called the confusion matrix:
               Correct
              0     1
Predicted 0   TN    FN
          1   FP    TP
From these four values, you can compute a number of possible measures of the quality of your classifier. In this exercise
you’ll just compute a few of the most useful.
The first, the sensitivity, also known as the recall or true positive rate, is computed as:

    TPR = TP / (TP + FN)

The second, the specificity, or true negative rate, is computed as:

    TNR = TN / (TN + FP)

The third, the false positive rate, is:

    FPR = FP / (FP + TN) = 1 − TNR
There are other combinations that can be useful; for more information, the Wikipedia page on the topic, Confusion matrix,
is worth reading.
https://en.wikipedia.org/wiki/Confusion_matrix
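
In code, these three rates are one-liners once the four counts are known. A minimal Python sketch (the function name and
the example counts are illustrative, not taken from the provided files):

def rates(tp, tn, fp, fn):
    # Sensitivity / recall / true positive rate.
    tpr = tp / (tp + fn)
    # Specificity / true negative rate.
    tnr = tn / (tn + fp)
    # False positive rate; note fpr == 1 - tnr.
    fpr = fp / (fp + tn)
    return tpr, tnr, fpr

# Example: 40 spam caught, 50 good kept, 5 good flagged, 5 spam missed.
print(rates(tp=40, tn=50, fp=5, fn=5))   # (0.888..., 0.909..., 0.0909...)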
The dataset
For this exercise, you’ll use a dataset called “Spambase” that has been used for several years as a benchmark for machine
learning algorithms that detect spam email messages. The data was downloaded from the UCI ML repository and is described
in the UCI repository page. The dataset consists of 4601 instances of 57 attributes each, with a class label that is either 0
for “good” or 1 for “spam”. The attributes are all numeric and reflect various things like word or character counts in the
message.
Obviously this is a case where it is quite likely that the number of spam and non-spam emails is unequal, so performing a
detailed evaluation of the classifier is important. The distinction is somewhat arbitrary, but in this case think of the decision
to classify a message as spam as the “positive” classification. Incorrectly classifying good emails as spam may be quite
bothersome to users, so we typically want to minimize the false positive rate. However, minimizing the FPR will trade off
with the true positive rate of the classifier - reducing the FPR will often reduce the TPR as well, letting spam messages
through the spam filter.
The included files
I have included many files you can use as the basis of your work:
• bagging.py - A simple implementation of the “bagging” approach to random forests. This implements the bagging_trees
classifier.
• classifier.py - An abstract class that implements a generic interface for classification. This also includes some
global functions which are useful in classifiers generally.
• datasets.py - Some code to read a few of the example labeled datasets.
• decision_tree.py - A very basic implementation of a decision tree classifier.
• extra_trees.py - An implementation of the “extremely randomized” trees classifier.
• Geometry.py - A few simple classes used for visualization and geometric objects.
• kdtree.py - An implementation of K-dimensional (K-D) trees for relatively fast nearest-neighbor computation.
• knnclassifier.py - Simple k-nearest neighbor classification using K-D trees.
• perceptron.py - Implementation of the classic “perceptron”, a predecessor of true neural networks.
• simple_nn.py - A simple backpropagation neural network. Hopelessly primitive by today’s standards, but demonstrating
many of the same basic principles still in use.
• spambase.py - Code to read the spambase.data file.
Your tasks
1. Using the basic code provided in spambase.py, add code to split the dataset into training and testing folds using 5-fold
random sub-sampling cross validation. This is not exactly the same procedure as used in the datasets.py function
evaluate, but you can use that as a guide.
To do random sub-sampling validation, reshuffle the dataset on every fold, and split it into two pieces, one containing
4/5 of the data (the training set) and the other containing 1/5 of the data (the test set).
Since there are 4601 items in the dataset, you should have 920 items in your test set on each fold and 3681 items in the
training set. Do not use these numbers directly in your code; compute them based on the length of the dataset (see the sketch after this task list).
2. For each fold, create a classifier. I recommend Extra trees, but feel free to try a different method¹. Train the classifier
on the training set, then compute its predictions for the test set. Remember that all of the classifiers used follow the
framework defined in classifier.py - they have two public methods, train() and predict(). Also keep in mind
that this is a relatively large dataset, so it will take a while to train and test 5 folds. You can experiment with a smaller
number of folds just to get your code running.
¹ If you are super ambitious, try to use a “real” algorithm from scikit-learn, tensorflow, or similar.
http://archive.ics.uci.edu/ml/datasets/Spambase
https://en.wikipedia.org/wiki/Cross-validation_(statistics)#Repeated_random_sub-sampling_validation
3. Compute the overall confusion matrix by combining all results over all of your folds.
4. Compute and print the TPR and FPR.
5. Experiment with different settings for your chosen classifier. For example, with extra trees you can try varying the
number of trees M, the minimum split size N_min, or the number of tests evaluated per node, K. Does changing these
values have a big effect on your results? Vary these parameters systematically (pick at least 4 or 5 different values or
sets of values) and produce a graph or a table.
6. Use a second classifier and compare results. For time reasons, you might want to try k-Nearest-Neighbor as your
comparison. If you do use kNN, report how your results change if you vary k.
7. Create a 1-page document (text, PDF, or Word) that summarizes your results.
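
The sketch referenced above covers tasks 1-4. It assumes the dataset is loaded as a Python list of (features, label) pairs
with label 1 meaning spam, that classifiers follow the train()/predict() interface from classifier.py (the exact argument
conventions of train() and predict() are assumptions; check the file), and that make_classifier is any zero-argument factory
wrapping your chosen classifier:

import random

def cross_validate(data, make_classifier, n_folds=5):
    # Random sub-sampling CV: reshuffle and re-split on every fold,
    # accumulating a single overall 2x2 confusion matrix.
    n_test = len(data) // n_folds       # ~1/5 of the data per fold (920 of 4601)
    tp = tn = fp = fn = 0
    for _ in range(n_folds):
        random.shuffle(data)            # reshuffle on every fold
        test, train = data[:n_test], data[n_test:]
        clf = make_classifier()
        clf.train(train)                # assumed signature; see classifier.py
        for features, label in test:
            pred = clf.predict(features)  # assumed signature; see classifier.py
            if pred == 1 and label == 1:
                tp += 1
            elif pred == 0 and label == 0:
                tn += 1
            elif pred == 1 and label == 0:
                fp += 1
            else:
                fn += 1
    print("TPR =", tp / (tp + fn))
    print("FPR =", fp / (fp + tn))
    return tp, tn, fp, fn

Re-running cross_validate with different factories (varying M, N_min, or K for extra trees, or k for kNN) produces the
comparisons asked for in tasks 5 and 6.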
What to hand in
Hand in any Python code you create or modify, as well as a brief (at most one page) summary of your results. Combine it all
into a ZIP file and upload to Omnivox.
a3-2-1-zgpisn1p-3nmaq3oe/spambase.README.txt
1. Title: SPAM E-mail Database
2. Sources:
(a) Creators: Mark Hopkins, Erik Reeber, George Forman, Jaap Suermondt
Hewlett-Packard Labs, 1501 Page Mill Rd., Palo Alto, CA 94304
(b) Donor: George Forman (gforman at nospam hpl.hp.com) 650-857-7835
(c) Generated: June-July 1999
3. Past Usage:
(a) Hewlett-Packard Internal-only Technical Report. External forthcoming.
(b) Determine whether a given email is spam or not.
(c) ~7% misclassification error.
False positives (marking good mail as spam) are very undesirable.
If we insist on zero false positives in the training/testing set,
20-25% of the spam passed through the filter.
4. Relevant Information:
The "spam" concept is diverse: advertisements for products/web
sites, make money fast schemes, chain letters, pornography...
    Our collection of spam e-mails came from our postmaster and
    individuals who had filed spam. Our collection of non-spam
    e-mails came from filed work and personal e-mails, and hence
    the word 'george' and the area code '650' are indicators of
    non-spam. These are useful when constructing a personalized
    spam filter. One would either have to blind such non-spam
    indicators or get a very wide collection of non-spam to
    generate a general purpose spam filter.
For background on spam:
Cranor, Lorrie F., LaMacchia, Brian A. Spam!
Communications of the ACM, 41(8):74-83, 1998.
5. Number of Instances: 4601 (1813 Spam = 39.4%)
6. Number of Attributes: 58 (57 continuous, 1 nominal class label)
7. Attribute Information:
The last column of 'spambase.data' denotes whether the e-mail was
considered spam (1) or not (0), i.e. unsolicited commercial e-mail.
Most of the attributes indicate whether a particular word or
character was frequently occurring in the e-mail. The run-length
attributes (55-57) measure the length of sequences of consecutive
capital letters. For the statistical measures of each attribute,
see the end of this file. Here are the definitions of the attributes:
48 continuous real [0,100] attributes of type word_freq_WORD
= percentage of words in the e-mail that match WORD,
i.e. 100 * (number of times the WORD appears in the e-mail) /
total number of words in e-mail. A "word" in this case is any
string of alphanumeric characters bounded by non-alphanumeric
characters or end-of-string.
6 continuous real [0,100] attributes of type char_freq_CHAR
= percentage of characters in the e-mail that match CHAR,
i.e. 100 * (number of CHAR occurrences) / total characters in e-mail
1 continuous real [1,...] attribute of type capital_run_length_average
= average length of uninterrupted sequences of capital letters
1 continuous integer [1,...] attribute of type capital_run_length_longest
= length of longest uninterrupted sequence of capital letters
1 continuous integer [1,...] attribute of type capital_run_length_total
= sum of length of uninterrupted sequences of capital letters
= total number of capital letters in the e-mail
1 nominal {0,1} class attribute of type spam
= denotes whether the e-mail was considered spam (1) or not (0),
i.e. unsolicited commercial e-mail.
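
As an illustration of how these definitions translate to code (a sketch only; the
actual features in spambase.data were computed at HP Labs, not by this snippet):

import re

def word_freq(text, word):
    # Percentage of words in `text` equal to `word` (case-insensitive);
    # a "word" is a maximal run of alphanumerics, per the definition above.
    words = re.findall(r"[A-Za-z0-9]+", text)
    hits = sum(1 for w in words if w.lower() == word.lower())
    return 100.0 * hits / len(words) if words else 0.0

def capital_runs(text):
    # (average, longest, total) length of runs of consecutive capital letters.
    runs = [len(r) for r in re.findall(r"[A-Z]+", text)]
    if not runs:
        return 0.0, 0, 0
    return sum(runs) / len(runs), max(runs), sum(runs)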
8. Missing Attribute Values: None
9. Class Distribution:
    Spam     1813 (39.4%)
    Non-Spam 2788 (60.6%)
Attribute Statistics:
Attr: Min: Max: Average: Std.Dev: Coeff.Var_%:
1 0 4.54 0.10455 0.30536 292
2 0 14.28 0.21301 1.2906 606
3 0 5.1 0.28066 0.50414 180
4 0 42.81 0.065425 1.3952 2130
5 0 10 0.31222 0.67251 215
6 0 5.88 0.095901 0.27382 286
7 0 7.27 0.11421 0.39144 343
8 0 11.11 0.10529 0.40107 381
9 0 5.26 0.090067 0.27862 309
10 0 18.18 0.23941 0.64476 269
11 0 2.61 0.059824 0.20154 337
12 0 9.67 0.5417 0.8617 159
13 0 5.55 0.09393 0.30104 320
14 0 10 0.058626 0.33518 572
15 0 4.41 0.049205 0.25884 526
16 0 20 0.24885 0.82579 332
17 0 7.14 0.14259 0.44406 311
18 0 9.09 0.18474 0.53112 287
19 0 18.75 1.6621 1.7755 107
20 0 18.18 0.085577 0.50977 596
21 0 11.11 0.80976 1.2008 148
22 0 17.1 0.1212 1.0258 846
23 0 5.45 0.10165 0.35029 345
24 0 12.5 0.094269 0.44264 470
25 0 20.83 0.5495 1.6713 304
26 0 16.66 0.26538 0.88696 334
27 0 33.33 0.7673 3.3673 439
28 0 9.09 0.12484 0.53858 431
29 0 14.28 0.098915 0.59333 600
30 0 5.88 0.10285 0.45668 444
31 0 12.5 0.064753 0.40339 623
32 0 4.76 0.047048 0.32856 698
33 0 18.18 0.097229 0.55591 572
34 0 4.76 0.047835 0.32945 689
35 0 20 0.10541 0.53226 505
36 0 7.69 0.097477 0.40262 413
37 0 6.89 0.13695 0.42345 309
38 0 8.33 0.013201 0.22065 1670
39 0 11.11 0.078629 0.43467 553
40 0 4.76 0.064834 0.34992 540
41 0 7.14 0.043667 0.3612 827
42 0 14.28 0.13234 0.76682 579
43 0 3.57 0.046099 0.22381 486
44 0 20 0.079196 0.62198 785
45 0 21.42 0.30122 1.0117 336
46 0 22.05 0.17982 0.91112 507
47 0 2.17 0.0054445 0.076274 1400
48 0 10 0.031869 0.28573 897
49 0 4.385 0.038575 0.24347 631
50 0 9.752 0.13903 0.27036 194
51 0 4.081 0.016976 0.10939 644
52 0 32.478 0.26907 0.81567 303
53 0 6.003 0.075811 0.24588 324
54 0 19.829 0.044238 0.42934 971
55 1 1102.5 5.1915 31.729 611
56 1 9989 52.173 194.89 374
57 1 15841 283.29 606.35 214
58 0 1 0.39404 0.4887 124
This file: 'spambase.DOCUMENTATION' at the UCI Machine Learning Repository
http://www.ics.uci.edu/~mlearn/MLRepository.html
a3-2-1-zgpisn1p-3nmaq3oe/parkinsons.data
name,MDVP:Fo(Hz),MDVP:Fhi(Hz),MDVP:Flo(Hz),MDVP:Jitter(%),MDVP:Jitter(Abs),MDVP:RAP,MDVP:PPQ,Jitter:DDP,MDVP:Shimmer,MDVP:Shimmer(dB),Shimmer:APQ3,Shimmer:APQ5,MDVP:APQ,Shimmer:DDA,NHR,HNR,status,RPDE,DFA,spread1,spread2,D2,PPE
phon_R01_S01_1,119.99200,157.30200,74.99700,0.00784,0.00007,0.00370,0.00554,0.01109,0.04374,0.42600,0.02182,0.03130,0.02971,0.06545,0.02211,21.03300,1,0.414783,0.815285,-4.813031,0.266482,2.301442,0.284654
phon_R01_S01_2,122.40000,148.65000,113.81900,0.00968,0.00008,0.00465,0.00696,0.01394,0.06134,0.62600,0.03134,0.04518,0.04368,0.09403,0.01929,19.08500,1,0.458359,0.819521,-4.075192,0.335590,2.486855,0.368674
phon_R01_S01_3,116.68200,131.11100,111.55500,0.01050,0.00009,0.00544,0.00781,0.01633,0.05233,0.48200,0.02757,0.03858,0.03590,0.08270,0.01309,20.65100,1,0.429895,0.825288,-4.443179,0.311173,2.342259,0.332634
phon_R01_S01_4,116.67600,137.87100,111.36600,0.00997,0.00009,0.00502,0.00698,0.01505,0.05492,0.51700,0.02924,0.04005,0.03772,0.08771,0.01353,20.64400,1,0.434969,0.819235,-4.117501,0.334147,2.405554,0.368975
phon_R01_S01_5,116.01400,141.78100,110.65500,0.01284,0.00011,0.00655,0.00908,0.01966,0.06425,0.58400,0.03490,0.04825,0.04465,0.10470,0.01767,19.64900,1,0.417356,0.823484,-3.747787,0.234513,2.332180,0.410335
phon_R01_S01_6,120.55200,131.16200,113.78700,0.00968,0.00008,0.00463,0.00750,0.01388,0.04701,0.45600,0.02328,0.03526,0.03243,0.06985,0.01222,21.37800,1,0.415564,0.825069,-4.242867,0.299111,2.187560,0.357775
phon_R01_S02_1,120.26700,137.24400,114.82000,0.00333,0.00003,0.00155,0.00202,0.00466,0.01608,0.14000,0.00779,0.00937,0.01351,0.02337,0.00607,24.88600,1,0.596040,0.764112,-5.634322,0.257682,1.854785,0.211756
phon_R01_S02_2,107.33200,113.84000,104.31500,0.00290,0.00003,0.00144,0.00182,0.00431,0.01567,0.13400,0.00829,0.00946,0.01256,0.02487,0.00344,26.89200,1,0.637420,0.763262,-6.167603,0.183721,2.064693,0.163755
phon_R01_S02_3,95.73000,132.06800,91.75400,0.00551,0.00006,0.00293,0.00332,0.00880,0.02093,0.19100,0.01073,0.01277,0.01717,0.03218,0.01070,21.81200,1,0.615551,0.773587,-5.498678,0.327769,2.322511,0.231571
phon_R01_S02_4,95.05600,120.10300,91.22600,0.00532,0.00006,0.00268,0.00332,0.00803,0.02838,0.25500,0.01441,0.01725,0.02444,0.04324,0.01022,21.86200,1,0.547037,0.798463,-5.011879,0.325996,2.432792,0.271362
phon_R01_S02_5,88.33300,112.24000,84.07200,0.00505,0.00006,0.00254,0.00330,0.00763,0.02143,0.19700,0.01079,0.01342,0.01892,0.03237,0.01166,21.11800,1,0.611137,0.776156,-5.249770,0.391002,2.407313,0.249740
phon_R01_S02_6,91.90400,115.87100,86.29200,0.00540,0.00006,0.00281,0.00336,0.00844,0.02752,0.24900,0.01424,0.01641,0.02214,0.04272,0.01141,21.41400,1,0.583390,0.792520,-4.960234,0.363566,2.642476,0.275931
phon_R01_S04_1,136.92600,159.86600,131.27600,0.00293,0.00002,0.00118,0.00153,0.00355,0.01259,0.11200,0.00656,0.00717,0.01140,0.01968,0.00581,25.70300,1,0.460600,0.646846,-6.547148,0.152813,2.041277,0.138512
phon_R01_S04_2,139.17300,179.13900,76.55600,0.00390,0.00003,0.00165,0.00208,0.00496,0.01642,0.15400,0.00728,0.00932,0.01797,0.02184,0.01041,24.88900,1,0.430166,0.665833,-5.660217,0.254989,2.519422,0.199889
phon_R01_S04_3,152.84500,163.30500,75.83600,0.00294,0.00002,0.00121,0.00149,0.00364,0.01828,0.15800,0.01064,0.00972,0.01246,0.03191,0.00609,24.92200,1,0.474791,0.654027,-6.105098,0.203653,2.125618,0.170100
phon_R01_S04_4,142.16700,217.45500,83.15900,0.00369,0.00003,0.00157,0.00203,0.00471,0.01503,0.12600,0.00772,0.00888,0.01359,0.02316,0.00839,25.17500,1,0.565924,0.658245,-5.340115,0.210185,2.205546,0.234589
phon_R01_S04_5,144.18800,349.25900,82.76400,0.00544,0.00004,0.00211,0.00292,0.00632,0.02047,0.19200,0.00969,0.01200,0.02074,0.02908,0.01859,22.33300,1,0.567380,0.644692,-5.440040,0.239764,2.264501,0.218164
phon_R01_S04_6,168.77800,232.18100,75.60300,0.00718,0.00004,0.00284,0.00387,0.00853,0.03327,0.34800,0.01441,0.01893,0.03430,0.04322,0.02919,20.37600,1,0.631099,0.605417,-2.931070,0.434326,3.007463,0.430788
phon_R01_S05_1,153.04600,175.82900,68.62300,0.00742,0.00005,0.00364,0.00432,0.01092,0.05517,0.54200,0.02471,0.03572,0.05767,0.07413,0.03160,17.28000,1,0.665318,0.719467,-3.949079,0.357870,3.109010,0.377429
phon_R01_S05_2,156.40500,189.39800,142.82200,0.00768,0.00005,0.00372,0.00399,0.01116,0.03995,0.34800,0.01721,0.02374,0.04310,0.05164,0.03365,17.15300,1,0.649554,0.686080,-4.554466,0.340176,2.856676,0.322111
phon_R01_S05_3,153.84800,165.73800,65.78200,0.00840,0.00005,0.00428,0.00450,0.01285,0.03810,0.32800,0.01667,0.02383,0.04055,0.05000,0.03871,17.53600,1,0.660125,0.704087,-4.095442,0.262564,2.739710,0.365391
phon_R01_S05_4,153.88000,172.86000,78.12800,0.00480,0.00003,0.00232,0.00267,0.00696,0.04137,0.37000,0.02021,0.02591,0.04525,0.06062,0.01849,19.49300,1,0.629017,0.698951,-5.186960,0.237622,2.557536,0.259765
phon_R01_S05_5,167.93000,193.22100,79.06800,0.00442,0.00003,0.00220,0.00247,0.00661,0.04351,0.37700,0.02228,0.02540,0.04246,0.06685,0.01280,22.46800,1,0.619060,0.679834,-4.330956,0.262384,2.916777,0.285695
phon_R01_S05_6,173.91700,192.73500,86.18000,0.00476,0.00003,0.00221,0.00258,0.00663,0.04192,0.36400,0.02187,0.02470,0.03772,0.06562,0.01840,20.42200,1,0.537264,0.686894,-5.248776,0.210279,2.547508,0.253556
phon_R01_S06_1,163.65600,200.84100,76.77900,0.00742,0.00005,0.00380,0.00390,0.01140,0.01659,0.16400,0.00738,0.00948,0.01497,0.02214,0.01778,23.83100,1,0.397937,0.732479,-5.557447,0.220890,2.692176,0.215961
phon_R01_S06_2,104.40000,206.00200,77.96800,0.00633,0.00006,0.00316,0.00375,0.00948,0.03767,0.38100,0.01732,0.02245,0.03780,0.05197,0.02887,22.06600,1,0.522746,0.737948,-5.571843,0.236853,2.846369,0.219514
phon_R01_S06_3,171.04100,208.31300,75.50100,0.00455,0.00003,0.00250,0.00234,0.00750,0.01966,0.18600,0.00889,0.01169,0.01872,0.02666,0.01095,25.90800,1,0.418622,0.720916,-6.183590,0.226278,2.589702,0.147403
phon_R01_S06_4,146.84500,208.70100,81.73700,0.00496,0.00003,0.00250,0.00275,0.00749,0.01919,0.19800,0.00883,0.01144,0.01826,0.02650,0.01328,25.11900,1,0.358773,0.726652,-6.271690,0.196102,2.314209,0.162999
phon_R01_S06_5,155.35800,227.38300,80.05500,0.00310,0.00002,0.00159,0.00176,0.00476,0.01718,0.16100,0.00769,0.01012,0.01661,0.02307,0.00677,25.97000,1,0.470478,0.676258,-7.120925,0.279789,2.241742,0.108514
phon_R01_S06_6,162.56800,198.34600,77.63000,0.00502,0.00003,0.00280,0.00253,0.00841,0.01791,0.16800,0.00793,0.01057,0.01799,0.02380,0.01170,25.67800,1,0.427785,0.723797,-6.635729,0.209866,1.957961,0.135242
phon_R01_S07_1,197.07600,206.89600,192.05500,0.00289,0.00001,0.00166,0.00168,0.00498,0.01098,0.09700,0.00563,0.00680,0.00802,0.01689,0.00339,26.77500,0,0.422229,0.741367,-7.348300,0.177551,1.743867,0.085569
phon_R01_S07_2,199.22800,209.51200,192.09100,0.00241,0.00001,0.00134,0.00138,0.00402,0.01015,0.08900,0.00504,0.00641,0.00762,0.01513,0.00167,30.94000,0,0.432439,0.742055,-7.682587,0.173319,2.103106,0.068501
phon_R01_S07_3,198.38300,215.20300,193.10400,0.00212,0.00001,0.00113,0.00135,0.00339,0.01263,0.11100,0.00640,0.00825,0.00951,0.01919,0.00119,30.77500,0,0.465946,0.738703,-7.067931,0.175181,1.512275,0.096320
phon_R01_S07_4,202.26600,211.60400,197.07900,0.00180,0.000009,0.00093,0.00107,0.00278,0.00954,0.08500,0.00469,0.00606,0.00719,0.01407,0.00072,32.68400,0,0.368535,0.742133,-7.695734,0.178540,1.544609,0.056141
phon_R01_S07_5,203.18400,211.52600,196.16000,0.00178,0.000009,0.00094,0.00106,0.00283,0.00958,0.08500,0.00468,0.00610,0.00726,0.01403,0.00065,33.04700,0,0.340068,0.741899,-7.964984,0.163519,1.423287,0.044539
phon_R01_S07_6,201.46400,210.56500,195.70800,0.00198,0.000010,0.00105,0.00115,0.00314,0.01194,0.10700,0.00586,0.00760,0.00957,0.01758,0.00135,31.73200,0,0.344252,0.742737,-7.777685,0.170183,2.447064,0.057610
phon_R01_S08_1,177.87600,192.92100,168.01300,0.00411,0.00002,0.00233,0.00241,0.00700,0.02126,0.18900,0.01154,0.01347,0.01612,0.03463,0.00586,23.21600,1,0.360148,0.778834,-6.149653,0.218037,2.477082,0.165827
phon_R01_S08_2,176.17000,185.60400,163.56400,0.00369,0.00002,0.00205,0.00218,0.00616,0.01851,0.16800,0.00938,0.01160,0.01491,0.02814,0.00340,24.95100,1,0.341435,0.783626,-6.006414,0.196371,2.536527,0.173218
phon_R01_S08_3,180.19800,201.24900,175.45600,0.00284,0.00002,0.00153,0.00166,0.00459,0.01444,0.13100,0.00726,0.00885,0.01190,0.02177,0.00231,26.73800,1,0.403884,0.766209,-6.452058,0.212294,2.269398,0.141929
phon_R01_S08_4,187.73300,202.32400,173.01500,0.00316,0.00002,0.00168,0.00182,0.00504,0.01663,0.15100,0.00829,0.01003,0.01366,0.02488,0.00265,26.31000,1,0.396793,0.758324,-6.006647,0.266892,2.382544,0.160691
phon_R01_S08_5,186.16300,197.72400,177.58400,0.00298,0.00002,0.00165,0.00175,0.00496,0.01495,0.13500,0.00774,0.00941,0.01233,0.02321,0.00231,26.82200,1,0.326480,0.765623,-6.647379,0.201095,2.374073,0.130554
phon_R01_S08_6,184.05500,196.53700,166.97700,0.00258,0.00001,0.00134,0.00147,0.00403,0.01463,0.13200,0.00742,0.00901,0.01234,0.02226,0.00257,26.45300,1,0.306443,0.759203,-7.044105,0.063412,2.361532,0.115730
phon_R01_S10_1,237.22600,247.32600,225.22700,0.00298,0.00001,0.00169,0.00182,0.00507,0.01752,0.16400,0.01035,0.01024,0.01133,0.03104,0.00740,22.73600,0,0.305062,0.654172,-7.310550,0.098648,2.416838,0.095032
phon_R01_S10_2,241.40400,248.83400,232.48300,0.00281,0.00001,0.00157,0.00173,0.00470,0.01760,0.15400,0.01006,0.01038,0.01251,0.03017,0.00675,23.14500,0,0.457702,0.634267,-6.793547,0.158266,2.256699,0.117399
phon_R01_S10_3,243.43900,250.91200,232.43500,0.00210,0.000009,0.00109,0.00137,0.00327,0.01419,0.12600,0.00777,0.00898,0.01033,0.02330,0.00454,25.36800,0,0.438296,0.635285,-7.057869,0.091608,2.330716,0.091470
phon_R01_S10_4,242.85200,255.03400,227.91100,0.00225,0.000009,0.00117,0.00139,0.00350,0.01494,0.13400,0.00847,0.00879,0.01014,0.02542,0.00476,25.03200,0,0.431285,0.638928,-6.995820,0.102083,2.365800,0.102706
phon_R01_S10_5,245.51000,262.09000,231.84800,0.00235,0.000010,0.00127,0.00148,0.00380,0.01608,0.14100,0.00906,0.00977,0.01149,0.02719,0.00476,24.60200,0,0.467489,0.631653,-7.156076,0.127642,2.392122,0.097336
phon_R01_S10_6,252.45500,261.48700,182.78600,0.00185,0.000007,0.00092,0.00113,0.00276,0.01152,0.10300,0.00614,0.00730,0.00860,0.01841,0.00432,26.80500,0,0.610367,0.635204,-7.319510,0.200873,2.028612,0.086398
phon_R01_S13_1,122.18800,128.61100,115.76500,0.00524,0.00004,0.00169,0.00203,0.00507,0.01613,0.14300,0.00855,0.00776,0.01433,0.02566,0.00839,23.16200,0,0.579597,0.733659,-6.439398,0.266392,2.079922,0.133867
phon_R01_S13_2,122.96400,130.04900,114.67600,0.00428,0.00003,0.00124,0.00155,0.00373,0.01681,0.15400,0.00930,0.00802,0.01400,0.02789,0.00462,24.97100,0,0.538688,0.754073,-6.482096,0.264967,2.054419,0.128872
phon_R01_S13_3,124.44500,135.06900,117.49500,0.00431,0.00003,0.00141,0.00167,0.00422,0.02184,0.19700,0.01241,0.01024,0.01685,0.03724,0.00479,25.13500,0,0.553134,0.775933,-6.650471,0.254498,1.840198,0.103561
phon_R01_S13_4,126.34400,134.23100,112.77300,0.00448,0.00004,0.00131,0.00169,0.00393,0.02033,0.18500,0.01143,0.00959,0.01614,0.03429,0.00474,25.03000,0,0.507504,0.760361,-6.689151,0.291954,2.431854,0.105993
phon_R01_S13_5,128.00100,138.05200,122.08000,0.00436,0.00003,0.00137,0.00166,0.00411,0.02297,0.21000,0.01323,0.01072,0.01677,0.03969,0.00481,24.69200,0,0.459766,0.766204,-7.072419,0.220434,1.972297,0.119308
phon_R01_S13_6,129.33600,139.86700,118.60400,0.00490,0.00004,0.00165,0.00183,0.00495,0.02498,0.22800,0.01396,0.01219,0.01947,0.04188,0.00484,25.42900,0,0.420383,0.785714,-6.836811,0.269866,2.223719,0.147491
phon_R01_S16_1,108.80700,134.65600,102.87400,0.00761,0.00007,0.00349,0.00486,0.01046,0.02719,0.25500,0.01483,0.01609,0.02067,0.04450,0.01036,21.02800,1,0.536009,0.819032,-4.649573,0.205558,1.986899,0.316700
phon_R01_S16_2,109.86000,126.35800,104.43700,0.00874,0.00008,0.00398,0.00539,0.01193,0.03209,0.30700,0.01789,0.01992,0.02454,0.05368,0.01180,20.76700,1,0.558586,0.811843,-4.333543,0.221727,2.014606,0.344834
phon_R01_S16_3,110.41700,131.06700,103.37000,0.00784,0.00007,0.00352,0.00514,0.01056,0.03715,0.33400,0.02032,0.02302,0.02802,0.06097,0.00969,21.42200,1,0.541781,0.821364,-4.438453,0.238298,1.922940,0.335041
phon_R01_S16_4,117.27400,129.91600,110.40200,0.00752,0.00006,0.00299,0.00469,0.00898,0.02293,0.22100,0.01189,0.01459,0.01948,0.03568,0.00681,22.81700,1,0.530529,0.817756,-4.608260,0.290024,2.021591,0.314464
phon_R01_S16_5,116.87900,131.89700,108.15300,0.00788,0.00007,0.00334,0.00493,0.01003,0.02645,0.26500,0.01394,0.01625,0.02137,0.04183,0.00786,22.60300,1,0.540049,0.813432,-4.476755,0.262633,1.827012,0.326197
phon_R01_S16_6,114.84700,271.31400,104.68000,0.00867,0.00008,0.00373,0.00520,0.01120,0.03225,0.35000,0.01805,0.01974,0.02519,0.05414,0.01143,21.66000,1,0.547975,0.817396,-4.609161,0.221711,1.831691,0.316395
phon_R01_S17_1,209.14400,237.49400,109.37900,0.00282,0.00001,0.00147,0.00152,0.00442,0.01861,0.17000,0.00975,0.01258,0.01382,0.02925,0.00871,25.55400,0,0.341788,0.678874,-7.040508,0.066994,2.460791,0.101516
phon_R01_S17_2,223.36500,238.98700,98.66400,0.00264,0.00001,0.00154,0.00151,0.00461,0.01906,0.16500,0.01013,0.01296,0.01340,0.03039,0.00301,26.13800,0,0.447979,0.686264,-7.293801,0.086372,2.321560,0.098555
phon_R01_S17_3,222.23600,231.34500,205.49500,0.00266,0.00001,0.00152,0.00144,0.00457,0.01643,0.14500,0.00867,0.01108,0.01200,0.02602,0.00340,25.85600,0,0.364867,0.694399,-6.966321,0.095882,2.278687,0.103224
phon_R01_S17_4,228.83200,234.61900,223.63400,0.00296,0.00001,0.00175,0.00155,0.00526,0.01644,0.14500,0.00882,0.01075,0.01179,0.02647,0.00351,25.96400,0,0.256570,0.683296,-7.245620,0.018689,2.498224,0.093534
phon_R01_S17_5,229.40100,252.22100,221.15600,0.00205,0.000009,0.00114,0.00113,0.00342,0.01457,0.12900,0.00769,0.00957,0.01016,0.02308,0.00300,26.41500,0,0.276850,0.673636,-7.496264,0.056844,2.003032,0.073581
phon_R01_S17_6,228.96900,239.54100,113.20100,0.00238,0.00001,0.00136,0.00140,0.00408,0.01745,0.15400,0.00942,0.01160,0.01234,0.02827,0.00420,24.54700,0,0.305429,0.681811,-7.314237,0.006274,2.118596,0.091546
phon_R01_S18_1,140.34100,159.77400,67.02100,0.00817,0.00006,0.00430,0.00440,0.01289,0.03198,0.31300,0.01830,0.01810,0.02428,0.05490,0.02183,19.56000,1,0.460139,0.720908,-5.409423,0.226850,2.359973,0.226156
phon_R01_S18_2,136.96900,166.60700,66.00400,0.00923,0.00007,0.00507,0.00463,0.01520,0.03111,0.30800,0.01638,0.01759,0.02603,0.04914,0.02659,19.97900,1,0.498133,0.729067,-5.324574,0.205660,2.291558,0.226247
phon_R01_S18_3,143.53300,162.21500,65.80900,0.01101,0.00008,0.00647,0.00467,0.01941,0.05384,0.47800,0.03152,0.02422,0.03392,0.09455,0.04882,20.33800,1,0.513237,0.731444,-5.869750,0.151814,2.118496,0.185580
phon_R01_S18_4,148.09000,162.82400,67.34300,0.00762,0.00005,0.00467,0.00354,0.01400,0.05428,0.49700,0.03357,0.02494,0.03635,0.10070,0.02431,21.71800,1,0.487407,0.727313,-6.261141,0.120956,2.137075,0.141958
phon_R01_S18_5,142.72900,162.40800,65.47600,0.00831,0.00006,0.00469,0.00419,0.01407,0.03485,0.36500,0.01868,0.01906,0.02949,0.05605,0.02599,20.26400,1,0.489345,0.730387,-5.720868,0.158830,2.277927,0.180828
phon_R01_S18_6,136.35800,176.59500,65.75000,0.00971,0.00007,0.00534,0.00478,0.01601,0.04978,0.48300,0.02749,0.02466,0.03736,0.08247,0.03361,18.57000,1,0.543299,0.733232,-5.207985,0.224852,2.642276,0.242981
phon_R01_S19_1,120.08000,139.71000,111.20800,0.00405,0.00003,0.00180,0.00220,0.00540,0.01706,0.15200,0.00974,0.00925,0.01345,0.02921,0.00442,25.74200,1,0.495954,0.762959,-5.791820,0.329066,2.205024,0.188180
phon_R01_S19_2,112.01400,588.51800,107.02400,0.00533,0.00005,0.00268,0.00329,0.00805,0.02448,0.22600,0.01373,0.01375,0.01956,0.04120,0.00623,24.17800,1,0.509127,0.789532,-5.389129,0.306636,1.928708,0.225461
phon_R01_S19_3,110.79300,128.10100,107.31600,0.00494,0.00004,0.00260,0.00283,0.00780,0.02442,0.21600,0.01432,0.01325,0.01831,0.04295,0.00479,25.43800,1,0.437031,0.815908,-5.313360,0.201861,2.225815,0.244512
phon_R01_S19_4,110.70700,122.61100,105.00700,0.00516,0.00005,0.00277,0.00289,0.00831,0.02215,0.20600,0.01284,0.01219,0.01715,0.03851,0.00472,25.19700,1,0.463514,0.807217,-5.477592,0.315074,1.862092,0.228624
phon_R01_S19_5,112.87600,148.82600,106.98100,0.00500,0.00004,0.00270,0.00289,0.00810,0.03999,0.35000,0.02413,0.02231,0.02704,0.07238,0.00905,23.37000,1,0.489538,0.789977,-5.775966,0.341169,2.007923,0.193918
phon_R01_S19_6,110.56800,125.39400,106.82100,0.00462,0.00004,0.00226,0.00280,0.00677,0.02199,0.19700,0.01284,0.01199,0.01636,0.03852,0.00420,25.82000,1,0.429484,0.816340,-5.391029,0.250572,1.777901,0.232744
phon_R01_S20_1,95.38500,102.14500,90.26400,0.00608,0.00006,0.00331,0.00332,0.00994,0.03202,0.26300,0.01803,0.01886,0.02455,0.05408,0.01062,21.87500,1,0.644954,0.779612,-5.115212,0.249494,2.017753,0.260015
phon_R01_S20_2,100.77000,115.69700,85.54500,0.01038,0.00010,0.00622,0.00576,0.01865,0.03121,0.36100,0.01773,0.01783,0.02139,0.05320,0.02220,19.20000,1,0.594387,0.790117,-4.913885,0.265699,2.398422,0.277948
phon_R01_S20_3,96.10600,108.66400,84.51000,0.00694,0.00007,0.00389,0.00415,0.01168,0.04024,0.36400,0.02266,0.02451,0.02876,0.06799,0.01823,19.05500,1,0.544805,0.770466,-4.441519,0.155097,2.645959,0.327978
phon_R01_S20_4,95.60500,107.71500,87.54900,0.00702,0.00007,0.00428,0.00371,0.01283,0.03156,0.29600,0.01792,0.01841,0.02190,0.05377,0.01825,19.65900,1,0.576084,0.778747,-5.132032,0.210458,2.232576,0.260633
phon_R01_S20_5,100.96000,110.01900,95.62800,0.00606,0.00006,0.00351,0.00348,0.01053,0.02427,0.21600,0.01371,0.01421,0.01751,0.04114,0.01237,20.53600,1,0.554610,0.787896,-5.022288,0.146948,2.428306,0.264666
phon_R01_S20_6,98.80400,102.30500,87.80400,0.00432,0.00004,0.00247,0.00258,0.00742,0.02223,0.20200,0.01277,0.01343,0.01552,0.03831,0.00882,22.24400,1,0.576644,0.772416,-6.025367,0.078202,2.053601,0.177275
phon_R01_S21_1,176.85800,205.56000,75.34400,0.00747,0.00004,0.00418,0.00420,0.01254,0.04795,0.43500,0.02679,0.03022,0.03510,0.08037,0.05470,13.89300,1,0.556494,0.729586,-5.288912,0.343073,3.099301,0.242119
phon_R01_S21_2,180.97800,200.12500,155.49500,0.00406,0.00002,0.00220,0.00244,0.00659,0.03852,0.33100,0.02107,0.02493,0.02877,0.06321,0.02782,16.17600,1,0.583574,0.727747,-5.657899,0.315903,3.098256,0.200423
phon_R01_S21_3,178.22200,202.45000,141.04700,0.00321,0.00002,0.00163,0.00194,0.00488,0.03759,0.32700,0.02073,0.02415,0.02784,0.06219,0.03151,15.92400,1,0.598714,0.712199,-6.366916,0.335753,2.654271,0.144614
phon_R01_S21_4,176.28100,227.38100,125.61000,0.00520,0.00003,0.00287,0.00312,0.00862,0.06511,0.58000,0.03671,0.04159,0.04683,0.11012,0.04824,13.92200,1,0.602874,0.740837,-5.515071,0.299549,3.136550,0.220968
phon_R01_S21_5,173.89800,211.35000,74.67700,0.00448,0.00003,0.00237,0.00254,0.00710,0.06727,0.65000,0.03788,0.04254,0.04802,0.11363,0.04214,14.73900,1,0.599371,0.743937,-5.783272,0.299793,3.007096,0.194052
phon_R01_S21_6,179.71100,225.93000,144.87800,0.00709,0.00004,0.00391,0.00419,0.01172,0.04313,0.44200,0.02297,0.02768,0.03455,0.06892,0.07223,11.86600,1,0.590951,0.745526,-4.379411,0.375531,3.671155,0.332086
phon_R01_S21_7,166.60500,206.00800,78.03200,0.00742,0.00004,0.00387,0.00453,0.01161,0.06640,0.63400,0.03650,0.04282,0.05114,0.10949,0.08725,11.74400,1,0.653410,0.733165,-4.508984,0.389232,3.317586,0.301952
phon_R01_S22_1,151.95500,163.33500,147.22600,0.00419,0.00003,0.00224,0.00227,0.00672,0.07959,0.77200,0.04421,0.04962,0.05690,0.13262,0.01658,19.66400,1,0.501037,0.714360,-6.411497,0.207156,2.344876,0.134120
phon_R01_S22_2,148.27200,164.98900,142.29900,0.00459,0.00003,0.00250,0.00256,0.00750,0.04190,0.38300,0.02383,0.02521,0.03051,0.07150,0.01914,18.78000,1,0.454444,0.734504,-5.952058,0.087840,2.344336,0.186489
phon_R01_S22_3,152.12500,161.46900,76.59600,0.00382,0.00003,0.00191,0.00226,0.00574,0.05925,0.63700,0.03341,0.03794,0.04398,0.10024,0.01211,20.96900,1,0.447456,0.697790,-6.152551,0.173520,2.080121,0.160809
phon_R01_S22_4,157.82100,172.97500,68.40100,0.00358,0.00002,0.00196,0.00196,0.00587,0.03716,0.30700,0.02062,0.02321,0.02764,0.06185,0.00850,22.21900,1,0.502380,0.712170,-6.251425,0.188056,2.143851,0.160812
phon_R01_S22_5,157.44700,163.26700,149.60500,0.00369,0.00002,0.00201,0.00197,0.00602,0.03272,0.28300,0.01813,0.01909,0.02571,0.05439,0.01018,21.69300,1,0.447285,0.705658,-6.247076,0.180528,2.344348,0.164916
phon_R01_S22_6,159.11600,168.91300,144.81100,0.00342,0.00002,0.00178,0.00184,0.00535,0.03381,0.30700,0.01806,0.02024,0.02809,0.05417,0.00852,22.66300,1,0.366329,0.693429,-6.417440,0.194627,2.473239,0.151709
phon_R01_S24_1,125.03600,143.94600,116.18700,0.01280,0.00010,0.00743,0.00623,0.02228,0.03886,0.34200,0.02135,0.02174,0.03088,0.06406,0.08151,15.33800,1,0.629574,0.714485,-4.020042,0.265315,2.671825,0.340623
phon_R01_S24_2,125.79100,140.55700,96.20600,0.01378,0.00011,0.00826,0.00655,0.02478,0.04689,0.42200,0.02542,0.02630,0.03908,0.07625,0.10323,15.43300,1,0.571010,0.690892,-5.159169,0.202146,2.441612,0.260375
phon_R01_S24_3,126.51200,141.75600,99.77000,0.01936,0.00015,0.01159,0.00990,0.03476,0.06734,0.65900,0.03611,0.03963,0.05783,0.10833,0.16744,12.43500,1,0.638545,0.674953,-3.760348,0.242861,2.634633,0.378483
phon_R01_S24_4,125.64100,141.06800,116.34600,0.03316,0.00026,0.02144,0.01522,0.06433,0.09178,0.89100,0.05358,0.04791,0.06196,0.16074,0.31482,8.86700,1,0.671299,0.656846,-3.700544,0.260481,2.991063,0.370961
phon_R01_S24_5,128.45100,150.44900,75.63200,0.01551,0.00012,0.00905,0.00909,0.02716,0.06170,0.58400,0.03223,0.03672,0.05174,0.09669,0.11843,15.06000,1,0.639808,0.643327,-4.202730,0.310163,2.638279,0.356881
phon_R01_S24_6,139.22400,586.56700,66.15700,0.03011,0.00022,0.01854,0.01628,0.05563,0.09419,0.93000,0.05551,0.05005,0.06023,0.16654,0.25930,10.48900,1,0.596362,0.641418,-3.269487,0.270641,2.690917,0.444774
phon_R01_S25_1,150.25800,154.60900,75.34900,0.00248,0.00002,0.00105,0.00136,0.00315,0.01131,0.10700,0.00522,0.00659,0.01009,0.01567,0.00495,26.75900,1,0.296888,0.722356,-6.878393,0.089267,2.004055,0.113942
phon_R01_S25_2,154.00300,160.26700,128.62100,0.00183,0.00001,0.00076,0.00100,0.00229,0.01030,0.09400,0.00469,0.00582,0.00871,0.01406,0.00243,28.40900,1,0.263654,0.691483,-7.111576,0.144780,2.065477,0.093193
phon_R01_S25_3,149.68900,160.36800,133.60800,0.00257,0.00002,0.00116,0.00134,0.00349,0.01346,0.12600,0.00660,0.00818,0.01059,0.01979,0.00578,27.42100,1,0.365488,0.719974,-6.997403,0.210279,1.994387,0.112878
phon_R01_S25_4,155.07800,163.73600,144.14800,0.00168,0.00001,0.00068,0.00092,0.00204,0.01064,0.09700,0.00522,0.00632,0.00928,0.01567,0.00233,29.74600,1,0.334171,0.677930,-6.981201,0.184550,2.129924,0.106802
phon_R01_S25_5,151.88400,157.76500,133.75100,0.00258,0.00002,0.00115,0.00122,0.00346,0.01450,0.13700,0.00633,0.00788,0.01267,0.01898,0.00659,26.83300,1,0.393563,0.700246,-6.600023,0.249172,2.499148,0.105306
phon_R01_S25_6,151.98900,157.33900,132.85700,0.00174,0.00001,0.00075,0.00096,0.00225,0.01024,0.09300,0.00455,0.00576,0.00993,0.01364,0.00238,29.92800,1,0.311369,0.676066,-6.739151,0.160686,2.296873,0.115130
phon_R01_S26_1,193.03000,208.90000,80.29700,0.00766,0.00004,0.00450,0.00389,0.01351,0.03044,0.27500,0.01771,0.01815,0.02084,0.05312,0.00947,21.93400,1,0.497554,0.740539,-5.845099,0.278679,2.608749,0.185668
phon_R01_S26_2,200.71400,223.98200,89.68600,0.00621,0.00003,0.00371,0.00337,0.01112,0.02286,0.20700,0.01192,0.01439,0.01852,0.03576,0.00704,23.23900,1,0.436084,0.727863,-5.258320,0.256454,2.550961,0.232520
phon_R01_S26_3,208.51900,220.31500,199.02000,0.00609,0.00003,0.00368,0.00339,0.01105,0.01761,0.15500,0.00952,0.01058,0.01307,0.02855,0.00830,22.40700,1,0.338097,0.712466,-6.471427,0.184378,2.502336,0.136390
phon_R01_S26_4,204.66400,221.30000,189.62100,0.00841,0.00004,0.00502,0.00485,0.01506,0.02378,0.21000,0.01277,0.01483,0.01767,0.03831,0.01316,21.30500,1,0.498877,0.722085,-4.876336,0.212054,2.376749,0.268144
phon_R01_S26_5,210.14100,232.70600,185.25800,0.00534,0.00003,0.00321,0.00280,0.00964,0.01680,0.14900,0.00861,0.01017,0.01301,0.02583,0.00620,23.67100,1,0.441097,0.722254,-5.963040,0.250283,2.489191,0.177807
phon_R01_S26_6,206.32700,226.35500,92.02000,0.00495,0.00002,0.00302,0.00246,0.00905,0.02105,0.20900,0.01107,0.01284,0.01604,0.03320,0.01048,21.86400,1,0.331508,0.715121,-6.729713,0.181701,2.938114,0.115515
phon_R01_S27_1,151.87200,492.89200,69.08500,0.00856,0.00006,0.00404,0.00385,0.01211,0.01843,0.23500,0.00796,0.00832,0.01271,0.02389,0.06051,23.69300,1,0.407701,0.662668,-4.673241,0.261549,2.702355,0.274407
phon_R01_S27_2,158.21900,442.55700,71.94800,0.00476,0.00003,0.00214,0.00207,0.00642,0.01458,0.14800,0.00606,0.00747,0.01312,0.01818,0.01554,26.35600,1,0.450798,0.653823,-6.051233,0.273280,2.640798,0.170106
phon_R01_S27_3,170.75600,450.24700,79.03200,0.00555,0.00003,0.00244,0.00261,0.00731,0.01725,0.17500,0.00757,0.00971,0.01652,0.02270,0.01802,25.69000,1,0.486738,0.676023,-4.597834,0.372114,2.975889,0.282780
phon_R01_S27_4,178.28500,442.82400,82.06300,0.00462,0.00003,0.00157,0.00194,0.00472,0.01279,0.12900,0.00617,0.00744,0.01151,0.01851,0.00856,25.02000,1,0.470422,0.655239,-4.913137,0.393056,2.816781,0.251972
phon_R01_S27_5,217.11600,233.48100,93.97800,0.00404,0.00002,0.00127,0.00128,0.00381,0.01299,0.12400,0.00679,0.00631,0.01075,0.02038,0.00681,24.58100,1,0.462516,0.582710,-5.517173,0.389295,2.925862,0.220657
phon_R01_S27_6,128.94000,479.69700,88.25100,0.00581,0.00005,0.00241,0.00314,0.00723,0.02008,0.22100,0.00849,0.01117,0.01734,0.02548,0.02350,24.74300,1,0.487756,0.684130,-6.186128,0.279933,2.686240,0.152428
phon_R01_S27_7,176.82400,215.29300,83.96100,0.00460,0.00003,0.00209,0.00221,0.00628,0.01169,0.11700,0.00534,0.00630,0.01104,0.01603,0.01161,27.16600,1,0.400088,0.656182,-4.711007,0.281618,2.655744,0.234809
phon_R01_S31_1,138.19000,203.52200,83.34000,0.00704,0.00005,0.00406,0.00398,0.01218,0.04479,0.44100,0.02587,0.02567,0.03220,0.07761,0.01968,18.30500,1,0.538016,0.741480,-5.418787,0.160267,2.090438,0.229892
phon_R01_S31_2,182.01800,197.17300,79.18700,0.00842,0.00005,0.00506,0.00449,0.01517,0.02503,0.23100,0.01372,0.01580,0.01931,0.04115,0.01813,18.78400,1,0.589956,0.732903,-5.445140,0.142466,2.174306,0.215558
phon_R01_S31_3,156.23900,195.10700,79.82000,0.00694,0.00004,0.00403,0.00395,0.01209,0.02343,0.22400,0.01289,0.01420,0.01720,0.03867,0.02020,19.19600,1,0.618663,0.728421,-5.944191,0.143359,1.929715,0.181988
phon_R01_S31_4,145.17400,198.10900,80.63700,0.00733,0.00005,0.00414,0.00422,0.01242,0.02362,0.23300,0.01235,0.01495,0.01944,0.03706,0.01874,18.85700,1,0.637518,0.735546,-5.594275,0.127950,1.765957,0.222716
phon_R01_S31_5,138.14500,197.23800,81.11400,0.00544,0.00004,0.00294,0.00327,0.00883,0.02791,0.24600,0.01484,0.01805,0.02259,0.04451,0.01794,18.17800,1,0.623209,0.738245,-5.540351,0.087165,1.821297,0.214075
phon_R01_S31_6,166.88800,198.96600,79.51200,0.00638,0.00004,0.00368,0.00351,0.01104,0.02857,0.25700,0.01547,0.01859,0.02301,0.04641,0.01796,18.33000,1,0.585169,0.736964,-5.825257,0.115697,1.996146,0.196535
phon_R01_S32_1,119.03100,127.53300,109.21600,0.00440,0.00004,0.00214,0.00192,0.00641,0.01033,0.09800,0.00538,0.00570,0.00811,0.01614,0.01724,26.84200,1,0.457541,0.699787,-6.890021,0.152941,2.328513,0.112856
phon_R01_S32_2,120.07800,126.63200,105.66700,0.00270,0.00002,0.00116,0.00135,0.00349,0.01022,0.09000,0.00476,0.00588,0.00903,0.01428,0.00487,26.36900,1,0.491345,0.718839,-5.892061,0.195976,2.108873,0.183572
phon_R01_S32_3,120.28900,128.14300,100.20900,0.00492,0.00004,0.00269,0.00238,0.00808,0.01412,0.12500,0.00703,0.00820,0.01194,0.02110,0.01610,23.94900,1,0.467160,0.724045,-6.135296,0.203630,2.539724,0.169923
phon_R01_S32_4,120.25600,125.30600,104.77300,0.00407,0.00003,0.00224,0.00205,0.00671,0.01516,0.13800,0.00721,0.00815,0.01310,0.02164,0.01015,26.01700,1,0.468621,0.735136,-6.112667,0.217013,2.527742,0.170633
phon_R01_S32_5,119.05600,125.21300,86.79500,0.00346,0.00003,0.00169,0.00170,0.00508,0.01201,0.10600,0.00633,0.00701,0.00915,0.01898,0.00903,23.38900,1,0.470972,0.721308,-5.436135,0.254909,2.516320,0.232209
phon_R01_S32_6,118.74700,123.72300,109.83600,0.00331,0.00003,0.00168,0.00171,0.00504,0.01043,0.09900,0.00490,0.00621,0.00903,0.01471,0.00504,25.61900,1,0.482296,0.723096,-6.448134,0.178713,2.034827,0.141422
phon_R01_S33_1,106.51600,112.77700,93.10500,0.00589,0.00006,0.00291,0.00319,0.00873,0.04932,0.44100,0.02683,0.03112,0.03651,0.08050,0.03031,17.06000,1,0.637814,0.744064,-5.301321,0.320385,2.375138,0.243080
phon_R01_S33_2,110.45300,127.61100,105.55400,0.00494,0.00004,0.00244,0.00315,0.00731,0.04128,0.37900,0.02229,0.02592,0.03316,0.06688,0.02529,17.70700,1,0.653427,0.706687,-5.333619,0.322044,2.631793,0.228319
phon_R01_S33_3,113.40000,133.34400,107.81600,0.00451,0.00004,0.00219,0.00283,0.00658,0.04879,0.43100,0.02385,0.02973,0.04370,0.07154,0.02278,19.01300,1,0.647900,0.708144,-4.378916,0.300067,2.445502,0.259451
phon_R01_S33_4,113.16600,130.27000,100.67300,0.00502,0.00004,0.00257,0.00312,0.00772,0.05279,0.47600,0.02896,0.03347,0.04134,0.08689,0.03690,16.74700,1,0.625362,0.708617,-4.654894,0.304107,2.672362,0.274387
phon_R01_S33_5,112.23900,126.60900,104.09500,0.00472,0.00004,0.00238,0.00290,0.00715,0.05643,0.51700,0.03070,0.03530,0.04451,0.09211,0.02629,17.36600,1,0.640945,0.701404,-5.634576,0.306014,2.419253,0.209191
phon_R01_S33_6,116.15000,131.73100,109.81500,0.00381,0.00003,0.00181,0.00232,0.00542,0.03026,0.26700,0.01514,0.01812,0.02770,0.04543,0.01827,18.80100,1,0.624811,0.696049,-5.866357,0.233070,2.445646,0.184985
phon_R01_S34_1,170.36800,268.79600,79.54300,0.00571,0.00003,0.00232,0.00269,0.00696,0.03273,0.28100,0.01713,0.01964,0.02824,0.05139,0.02485,18.54000,1,0.677131,0.685057,-4.796845,0.397749,2.963799,0.277227
phon_R01_S34_2,208.08300,253.79200,91.80200,0.00757,0.00004,0.00428,0.00428,0.01285,0.06725,0.57100,0.04016,0.04003,0.04464,0.12047,0.04238,15.64800,1,0.606344,0.665945,-5.410336,0.288917,2.665133,0.231723
phon_R01_S34_3,198.45800,219.29000,148.69100,0.00376,0.00002,0.00182,0.00215,0.00546,0.03527,0.29700,0.02055,0.02076,0.02530,0.06165,0.01728,18.70200,1,0.606273,0.661735,-5.585259,0.310746,2.465528,0.209863
phon_R01_S34_4,202.80500,231.50800,86.23200,0.00370,0.00002,0.00189,0.00211,0.00568,0.01997,0.18000,0.01117,0.01177,0.01506,0.03350,0.02010,18.68700,1,0.536102,0.632631,-5.898673,0.213353,2.470746,0.189032
phon_R01_S34_5,202.54400,241.35000,164.16800,0.00254,0.00001,0.00100,0.00133,0.00301,0.02662,0.22800,0.01475,0.01558,0.02006,0.04426,0.01049,20.68000,1,0.497480,0.630409,-6.132663,0.220617,2.576563,0.159777
phon_R01_S34_6,223.36100,263.87200,87.63800,0.00352,0.00002,0.00169,0.00188,0.00506,0.02536,0.22500,0.01379,0.01478,0.01909,0.04137,0.01493,20.36600,1,0.566849,0.574282,-5.456811,0.345238,2.840556,0.232861
phon_R01_S35_1,169.77400,191.75900,151.45100,0.01568,0.00009,0.00863,0.00946,0.02589,0.08143,0.82100,0.03804,0.05426,0.08808,0.11411,0.07530,12.35900,1,0.561610,0.793509,-3.297668,0.414758,3.413649,0.457533
phon_R01_S35_2,183.52000,216.81400,161.34000,0.01466,0.00008,0.00849,0.00819,0.02546,0.06050,0.61800,0.02865,0.04101,0.06359,0.08595,0.06057,14.36700,1,0.478024,0.768974,-4.276605,0.355736,3.142364,0.336085
phon_R01_S35_3,188.62000,216.30200,165.98200,0.01719,0.00009,0.00996,0.01027,0.02987,0.07118,0.72200,0.03474,0.04580,0.06824,0.10422,0.08069,12.29800,1,0.552870,0.764036,-3.377325,0.335357,3.274865,0.418646
phon_R01_S35_4,202.63200,565.74000,177.25800,0.01627,0.00008,0.00919,0.00963,0.02756,0.07170,0.83300,0.03515,0.04265,0.06460,0.10546,0.07889,14.98900,1,0.427627,0.775708,-4.892495,0.262281,2.910213,0.270173
phon_R01_S35_5,186.69500,211.96100,149.44200,0.01872,0.00010,0.01075,0.01154,0.03225,0.05830,0.78400,0.02699,0.03714,0.06259,0.08096,0.10952,12.52900,1,0.507826,0.762726,-4.484303,0.340256,2.958815,0.301487
phon_R01_S35_6,192.81800,224.42900,168.79300,0.03107,0.00016,0.01800,0.01958,0.05401,0.11908,1.30200,0.05647,0.07940,0.13778,0.16942,0.21713,8.44100,1,0.625866,0.768320,-2.434031,0.450493,3.079221,0.527367
phon_R01_S35_7,198.11600,233.09900,174.47800,0.02714,0.00014,0.01568,0.01699,0.04705,0.08684,1.01800,0.04284,0.05556,0.08318,0.12851,0.16265,9.44900,1,0.584164,0.754449,-2.839756,0.356224,3.184027,0.454721
phon_R01_S37_1,121.34500,139.64400,98.25000,0.00684,0.00006,0.00388,0.00332,0.01164,0.02534,0.24100,0.01340,0.01399,0.02056,0.04019,0.04179,21.52000,1,0.566867,0.670475,-4.865194,0.246404,2.013530,0.168581
phon_R01_S37_2,119.10000,128.44200,88.83300,0.00692,0.00006,0.00393,0.00300,0.01179,0.02682,0.23600,0.01484,0.01405,0.02018,0.04451,0.04611,21.82400,1,0.651680,0.659333,-4.239028,0.175691,2.451130,0.247455
phon_R01_S37_3,117.87000,127.34900,95.65400,0.00647,0.00005,0.00356,0.00300,0.01067,0.03087,0.27600,0.01659,0.01804,0.02402,0.04977,0.02631,22.43100,1,0.628300,0.652025,-3.583722,0.207914,2.439597,0.206256
phon_R01_S37_4,122.33600,142.36900,94.79400,0.00727,0.00006,0.00415,0.00339,0.01246,0.02293,0.22300,0.01205,0.01289,0.01771,0.03615,0.03191,22.95300,1,0.611679,0.623731,-5.435100,0.230532,2.699645,0.220546
phon_R01_S37_5,117.96300,134.20900,100.75700,0.01813,0.00015,0.01117,0.00718,0.03351,0.04912,0.43800,0.02610,0.02161,0.02916,0.07830,0.10748,19.07500,1,0.630547,0.646786,-3.444478,0.303214,2.964568,0.261305
phon_R01_S37_6,126.14400,154.28400,97.54300,0.00975,0.00008,0.00593,0.00454,0.01778,0.02852,0.26600,0.01500,0.01581,0.02157,0.04499,0.03828,21.53400,1,0.635015,0.627337,-5.070096,0.280091,2.892300,0.249703
phon_R01_S39_1,127.93000,138.75200,112.17300,0.00605,0.00005,0.00321,0.00318,0.00962,0.03235,0.33900,0.01360,0.01650,0.03105,0.04079,0.02663,19.65100,1,0.654945,0.675865,-5.498456,0.234196,2.103014,0.216638
phon_R01_S39_2,114.23800,124.39300,77.02200,0.00581,0.00005,0.00299,0.00316,0.00896,0.04009,0.40600,0.01579,0.01994,0.04114,0.04736,0.02073,20.43700,1,0.653139,0.694571,-5.185987,0.259229,2.151121,0.244948
phon_R01_S39_3,115.32200,135.73800,107.80200,0.00619,0.00005,0.00352,0.00329,0.01057,0.03273,0.32500,0.01644,0.01722,0.02931,0.04933,0.02810,19.38800,1,0.577802,0.684373,-5.283009,0.226528,2.442906,0.238281
phon_R01_S39_4,114.55400,126.77800,91.12100,0.00651,0.00006,0.00366,0.00340,0.01097,0.03658,0.36900,0.01864,0.01940,0.03091,0.05592,0.02707,18.95400,1,0.685151,0.719576,-5.529833,0.242750,2.408689,0.220520
phon_R01_S39_5,112.15000,131.66900,97.52700,0.00519,0.00005,0.00291,0.00284,0.00873,0.01756,0.15500,0.00967,0.01033,0.01363,0.02902,0.01435,21.21900,1,0.557045,0.673086,-5.617124,0.184896,1.871871,0.212386
phon_R01_S39_6,102.27300,142.83000,85.90200,0.00907,0.00009,0.00493,0.00461,0.01480,0.02814,0.27200,0.01579,0.01553,0.02073,0.04736,0.03882,18.44700,1,0.671378,0.674562,-2.929379,0.396746,2.560422,0.367233
phon_R01_S42_1,236.20000,244.66300,102.13700,0.00277,0.00001,0.00154,0.00153,0.00462,0.02448,0.21700,0.01410,0.01426,0.01621,0.04231,0.00620,24.07800,0,0.469928,0.628232,-6.816086,0.172270,2.235197,0.119652
phon_R01_S42_2,237.32300,243.70900,229.25600,0.00303,0.00001,0.00173,0.00159,0.00519,0.01242,0.11600,0.00696,0.00747,0.00882,0.02089,0.00533,24.67900,0,0.384868,0.626710,-7.018057,0.176316,1.852402,0.091604
phon_R01_S42_3,260.10500,264.91900,237.30300,0.00339,0.00001,0.00205,0.00186,0.00616,0.02030,0.19700,0.01186,0.01230,0.01367,0.03557,0.00910,21.08300,0,0.440988,0.628058,-7.517934,0.160414,1.881767,0.075587
phon_R01_S42_4,197.56900,217.62700,90.79400,0.00803,0.00004,0.00490,0.00448,0.01470,0.02177,0.18900,0.01279,0.01272,0.01439,0.03836,0.01337,19.26900,0,0.372222,0.725216,-5.736781,0.164529,2.882450,0.202879
phon_R01_S42_5,240.30100,245.13500,219.78300,0.00517,0.00002,0.00316,0.00283,0.00949,0.02018,0.21200,0.01176,0.01191,0.01344,0.03529,0.00965,21.02000,0,0.371837,0.646167,-7.169701,0.073298,2.266432,0.100881
phon_R01_S42_6,244.99000,272.21000,239.17000,0.00451,0.00002,0.00279,0.00237,0.00837,0.01897,0.18100,0.01084,0.01121,0.01255,0.03253,0.01049,21.52800,0,0.522812,0.646818,-7.304500,0.171088,2.095237,0.096220
phon_R01_S43_1,112.54700,133.37400,105.71500,0.00355,0.00003,0.00166,0.00190,0.00499,0.01358,0.12900,0.00664,0.00786,0.01140,0.01992,0.00435,26.43600,0,0.413295,0.756700,-6.323531,0.218885,2.193412,0.160376
phon_R01_S43_2,110.73900,113.59700,100.13900,0.00356,0.00003,0.00170,0.00200,0.00510,0.01484,0.13300,0.00754,0.00950,0.01285,0.02261,0.00430,26.55000,0,0.369090,0.776158,-6.085567,0.192375,1.889002,0.174152
phon_R01_S43_3,113.71500,116.44300,96.91300,0.00349,0.00003,0.00171,0.00203,0.00514,0.01472,0.13300,0.00748,0.00905,0.01148,0.02245,0.00478,26.54700,0,0.380253,0.766700,-5.943501,0.192150,1.852542,0.179677
phon_R01_S43_4,117.00400,144.46600,99.92300,0.00353,0.00003,0.00176,0.00218,0.00528,0.01657,0.14500,0.00881,0.01062,0.01318,0.02643,0.00590,25.44500,0,0.387482,0.756482,-6.012559,0.229298,1.872946,0.163118
phon_R01_S43_5,115.38000,123.10900,108.63400,0.00332,0.00003,0.00160,0.00199,0.00480,0.01503,0.13700,0.00812,0.00933,0.01133,0.02436,0.00401,26.00500,0,0.405991,0.761255,-5.966779,0.197938,1.974857,0.184067
phon_R01_S43_6,116.38800,129.03800,108.97000,0.00346,0.00003,0.00169,0.00213,0.00507,0.01725,0.15500,0.00874,0.01021,0.01331,0.02623,0.00415,26.14300,0,0.361232,0.763242,-6.016891,0.109256,2.004719,0.174429
phon_R01_S44_1,151.73700,190.20400,129.85900,0.00314,0.00002,0.00135,0.00162,0.00406,0.01469,0.13200,0.00728,0.00886,0.01230,0.02184,0.00570,24.15100,1,0.396610,0.745957,-6.486822,0.197919,2.449763,0.132703
phon_R01_S44_2,148.79000,158.35900,138.99000,0.00309,0.00002,0.00152,0.00186,0.00456,0.01574,0.14200,0.00839,0.00956,0.01309,0.02518,0.00488,24.41200,1,0.402591,0.762508,-6.311987,0.182459,2.251553,0.160306
phon_R01_S44_3,148.14300,155.98200,135.04100,0.00392,0.00003,0.00204,0.00231,0.00612,0.01450,0.13100,0.00725,0.00876,0.01263,0.02175,0.00540,23.68300,1,0.398499,0.778349,-5.711205,0.240875,2.845109,0.192730
phon_R01_S44_4,150.44000,163.44100,144.73600,0.00396,0.00003,0.00206,0.00233,0.00619,0.02551,0.23700,0.01321,0.01574,0.02148,0.03964,0.00611,23.13300,1,0.352396,0.759320,-6.261446,0.183218,2.264226,0.144105
phon_R01_S44_5,148.46200,161.07800,141.99800,0.00397,0.00003,0.00202,0.00235,0.00605,0.01831,0.16300,0.00950,0.01103,0.01559,0.02849,0.00639,22.86600,1,0.408598,0.768845,-5.704053,0.216204,2.679185,0.197710
phon_R01_S44_6,149.81800,163.41700,144.78600,0.00336,0.00002,0.00174,0.00198,0.00521,0.02145,0.19800,0.01155,0.01341,0.01666,0.03464,0.00595,23.00800,1,0.329577,0.757180,-6.277170,0.109397,2.209021,0.156368
phon_R01_S49_1,117.22600,123.92500,106.65600,0.00417,0.00004,0.00186,0.00270,0.00558,0.01909,0.17100,0.00864,0.01223,0.01949,0.02592,0.00955,23.07900,0,0.603515,0.669565,-5.619070,0.191576,2.027228,0.215724
phon_R01_S49_2,116.84800,217.55200,99.50300,0.00531,0.00005,0.00260,0.00346,0.00780,0.01795,0.16300,0.00810,0.01144,0.01756,0.02429,0.01179,22.08500,0,0.663842,0.656516,-5.198864,0.206768,2.120412,0.252404
phon_R01_S49_3,116.28600,177.29100,96.98300,0.00314,0.00003,0.00134,0.00192,0.00403,0.01564,0.13600,0.00667,0.00990,0.01691,0.02001,0.00737,24.19900,0,0.598515,0.654331,-5.592584,0.133917,2.058658,0.214346
phon_R01_S49_4,116.55600,592.03000,86.22800,0.00496,0.00004,0.00254,0.00263,0.00762,0.01660,0.15400,0.00820,0.00972,0.01491,0.02460,0.01397,23.95800,0,0.566424,0.667654,-6.431119,0.153310,2.161936,0.120605
phon_R01_S49_5,116.34200,581.28900,94.24600,0.00267,0.00002,0.00115,0.00148,0.00345,0.01300,0.11700,0.00631,0.00789,0.01144,0.01892,0.00680,25.02300,0,0.528485,0.663884,-6.359018,0.116636,2.152083,0.138868
phon_R01_S49_6,114.56300,119.16700,86.64700,0.00327,0.00003,0.00146,0.00184,0.00439,0.01185,0.10600,0.00557,0.00721,0.01095,0.01672,0.00703,24.77500,0,0.555303,0.659132,-6.710219,0.149694,1.913990,0.121777
phon_R01_S50_1,201.77400,262.70700,78.22800,0.00694,0.00003,0.00412,0.00396,0.01235,0.02574,0.25500,0.01454,0.01582,0.01758,0.04363,0.04441,19.36800,0,0.508479,0.683761,-6.934474,0.159890,2.316346,0.112838
phon_R01_S50_2,174.18800,230.97800,94.26100,0.00459,0.00003,0.00263,0.00259,0.00790,0.04087,0.40500,0.02336,0.02498,0.02745,0.07008,0.02764,19.51700,0,0.448439,0.657899,-6.538586,0.121952,2.657476,0.133050
phon_R01_S50_3,209.51600,253.01700,89.48800,0.00564,0.00003,0.00331,0.00292,0.00994,0.02751,0.26300,0.01604,0.01657,0.01879,0.04812,0.01810,19.14700,0,0.431674,0.683244,-6.195325,0.129303,2.784312,0.168895
phon_R01_S50_4,174.68800,240.00500,74.28700,0.01360,0.00008,0.00624,0.00564,0.01873,0.02308,0.25600,0.01268,0.01365,0.01667,0.03804,0.10715,17.88300,0,0.407567,0.655683,-6.787197,0.158453,2.679772,0.131728
phon_R01_S50_5,198.76400,396.96100,74.90400,0.00740,0.00004,0.00370,0.00390,0.01109,0.02296,0.24100,0.01265,0.01321,0.01588,0.03794,0.07223,19.02000,0,0.451221,0.643956,-6.744577,0.207454,2.138608,0.123306
phon_R01_S50_6,214.28900,260.27700,77.97300,0.00567,0.00003,0.00295,0.00317,0.00885,0.01884,0.19000,0.01026,0.01161,0.01373,0.03078,0.04398,21.20900,0,0.462803,0.664357,-5.724056,0.190667,2.555477,0.148569
a3-2-1-zgpisn1p-3nmaq3oe/perceptron.py
from classifier import *
from datasets import *
from random import shuffle, seed


def dot_product(x, y):
    '''Compute the dot product (scalar product) of two vectors.'''
    return sum(a * b for a, b in zip(x, y))
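# For example, dot_product([1, 2], [3, 4]) == 1*3 + 2*4 == 11.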
class perceptron(classifier):
    '''Simple implementation of a multiclass perceptron.

    In a multiclass perceptron, there is one weight vector per class.
    '''
    def __init__(self, n_features, n_classes):
        '''Initialize a multiclass perceptron with 'n_features'
        dimensions and 'n_classes' output classifications.'''
        self.n_features = n_features + 1  # additional bias term.
        self.n_classes = n_classes
        self.weights = []
        for i in range(self.n_classes):
            self.weights.append([0] * self.n_features)

    def train(self, train_data, iterations=1000):
        '''Training phase for the multiclass perceptron.'''
        for iter in range(iterations):
            n_errors = 0
            for item in train_data:
                pred_label, feature_vector = self.__predict(item.data)
                if item.label != pred_label:
                    n_errors += 1
                    # Perceptron update rule: reinforce the weights of the
                    # correct class, penalize those of the mistaken one.
                    for i in range(self.n_features):
                        try:
                            self.weights[item.label][i] += feature_vector[i]
                            self.weights[pred_label][i] -= feature_vector[i]
                        except IndexError:
                            # Debugging aid: report malformed items.
                            print(item.label, i, len(item.data))
            if n_errors == 0:
                break

    def __predict(self, data):
        '''Return the most likely class label for an unknown data point.'''
        inputs = data + [1]  # bias term
        activations = [dot_product(inputs, self.weights[c])
                       for c in range(self.n_classes)]
        predicted_label = argmax(activations)
        return predicted_label, inputs

    def predict(self, data):
        result = self.__predict(data)
        return result[0]


# Main program
if __name__ == "__main__":
    seed(0)  # Force the RNG to a known state.
    iris_dataset = read_iris_dataset()
    shuffle(iris_dataset)
    wine_dataset = read_wine_dataset()
    shuffle(wine_dataset)
    wheat_dataset = read_seeds_dataset()
    shuffle(wheat_dataset)
    print("Testing the perceptron.")
    print()
    print("The iris dataset.")
    print("With raw data:")
    print("accuracy:", evaluate(iris_dataset, perceptron, 4, n_features=4, n_classes=3))
    print("With normalized data:")
    print("accuracy:", evaluate(normalize_dataset(iris_dataset), perceptron, 4, n_features=4, n_classes=3))
    print()
    print("The wheat dataset.")
    print("With raw data:")
    print("accuracy:", evaluate(wheat_dataset, perceptron, 4, n_features=7, n_classes=3))
    print("With normalized data:")
    print("accuracy:", evaluate(normalize_dataset(wheat_dataset), perceptron, 4, n_features=7, n_classes=3))
    print()
    print("The wine dataset.")
    print("With raw data:")
    print("accuracy:", evaluate(wine_dataset, perceptron, 4, n_features=13, n_classes=3))
    print("With normalized data:")
    print("accuracy:", evaluate(normalize_dataset(wine_dataset), perceptron, 4, n_features=13, n_classes=3))
    print()
    print("The xor dataset.")
    xor = [data_item(0, [0.0, 0.0]),
           data_item(1, [1.0, 0.0]),
           data_item(1, [0.0, 1.0]),
           data_item(0, [1.0, 1.0])]
    p = perceptron(2, 2)
    p.train(xor, 2000)
    n_correct = 0
    for item in xor:
        label = p.predict(item.data)
        if label == item.label:
            n_correct += 1
    print("accuracy: ", n_correct / len(xor))
a3-2-1-zgpisn1p-3nmaq3oe/classifier.py
class data_item(object):
    '''Class to represent a generic labeled data item.'''
    def __init__(self, label, data):
        self.label = label
        self.data = data


def normalize_dataset(dataset):
    '''Normalize all features to lie within the interval [0, 1].'''
    mins = list(dataset[0].data)
    maxs = list(dataset[0].data)
    m = len(mins)
    for i in range(m):
        features = [item.data[i] for item in dataset]
        mins[i] = min(features)
        maxs[i] = max(features)
    result = []
    for item in dataset:
        # Guard against constant features, which would otherwise cause
        # a division by zero; map them to 0.0.
        norm_data = [(v - n) / (x - n) if x > n else 0.0
                     for v, n, x in zip(item.data, mins, maxs)]
        result.append(data_item(item.label, norm_data))
    return result
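# Example (added): with a single feature taking values 5.0 and 15.0,
#   normalize_dataset([data_item(0, [5.0]), data_item(1, [15.0])])
# rescales the feature to 0.0 and 1.0 respectively.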

def distance(d1, d2):
    '''Squared n-dimensional Euclidean distance between 'd1' and 'd2'.
    (The square root is omitted; this preserves the ordering of
    distances, so nearest-neighbour comparisons are unaffected.)'''
    assert len(d1) == len(d2)
    r = 0
    for x, y in zip(d1, d2):
        d = x - y
        r += d * d
    return r


'''Very many machine-learning problems use the argmin and argmax
functions. These functions are widely used in optimization problems
of all sorts. The idea is simple - return the value of x for which the
function y = f(x) is a maximum (or minimum). In many ML problems the
values of x lie in a small discrete set, so it's easy to perform the
computation as done in the following functions.'''


def argmax(lst):
    '''Return the index of the largest value in 'lst'.'''
    return max(range(len(lst)), key=lambda i: lst[i])


def argmin(lst):
    '''Return the index of the smallest value in 'lst'.'''
    return min(range(len(lst)), key=lambda i: lst[i])
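# For example, argmax([0.1, 0.7, 0.2]) == 1 and argmin([0.1, 0.7, 0.2]) == 0.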
class classifier(object):
    '''Generic interface for a classifier.'''
    def __init__(self):
        '''Initialize the generic classifier.'''
        pass

    def train(self, train_data):
        '''Train a classifier using the 'train_data', which is a list of
        data_item objects.'''
        pass

    def predict(self, data_point):
        '''Given a new data point, return the most likely label
        for that value.'''
        pass


def evaluate(dataset, cls, n_folds=0, **kwargs):
    '''Evaluate the classifier on the dataset using 'n_folds' of
    cross-validation. If n_folds is equal to zero, the code performs
    leave-one-out cross-validation.'''
    if n_folds == 0:
        n_folds = len(dataset)
    test_size = round(len(dataset) / n_folds)
    index = 0
    n_correct = 0  # count correct predictions.
    n_tested = 0
    for fold in range(n_folds):
        train_data = dataset[:index] + dataset[index + test_size:]
        test_data = dataset[index:index + test_size]
        p = cls(**kwargs)
        p.train(train_data)
        for item in test_data:
            if p.predict(item.data) == item.label:
                n_correct += 1
            n_tested += 1
        index += test_size
    return n_correct / n_tested
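# Added sketch (not part of the handout): since the assignment asks for
# TPR/TNR/FPR rather than bare accuracy, a two-class companion to
# evaluate() can fill the four confusion-matrix bins directly. Labels are
# assumed to be 0 (negative) and 1 (positive).
def evaluate_confusion(dataset, cls, n_folds=0, **kwargs):
    '''Sketch: like evaluate(), but returns the counts (tn, fp, fn, tp)
    for a 2-class problem with labels 0 (negative) and 1 (positive).'''
    if n_folds == 0:
        n_folds = len(dataset)
    test_size = round(len(dataset) / n_folds)
    index = 0
    counts = [[0, 0], [0, 0]]  # counts[correct_label][predicted_label]
    for fold in range(n_folds):
        train_data = dataset[:index] + dataset[index + test_size:]
        test_data = dataset[index:index + test_size]
        p = cls(**kwargs)
        p.train(train_data)
        for item in test_data:
            counts[item.label][int(p.predict(item.data))] += 1
        index += test_size
    return counts[0][0], counts[0][1], counts[1][0], counts[1][1]
# From the returned counts, tpr = tp / (tp + fn) and fpr = fp / (fp + tn).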
a3-2-1-zgpisn1p-3nmaq3oe/spambase.py
from extra_trees import extra_trees  # suggestion!
from classifier import data_item, normalize_dataset
from random import shuffle

fp = open('spambase.data')
dataset = []
for line in fp:
    fields = line.split(',')
    data = [float(x) for x in fields[:-1]]
    label = int(fields[-1])
    dataset.append(data_item(label, data))
fp.close()
print("Read {} items.".format(len(dataset)))
print("{} features per item.".format(len(dataset[0].data)))

# Add your code here...

# Question 1.
import pandas as pd

# Read the same file with pandas (path made consistent with the starter
# code above); the 57 attribute columns are numbered 0-56 and the label
# column is named 'class'.
df = pd.read_csv('spambase.data', names=list(range(57)) + ['class'])
print(df.head())
df = df.fillna(0)

from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Use all 57 attributes as features (iloc[:, 1:-1] would silently drop
# the first one) and the final column as the label.
X = df.iloc[:, :-1]
y = df.iloc[:, -1]
clf = DecisionTreeClassifier()

# Question 2.
scores = cross_val_score(clf, X, y, cv=5)
# Report the mean accuracy with a +/- two-standard-deviation spread.
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import confusion_matrix

# Question 3.
y_pred = cross_val_predict(clf, X, y, cv=5)
conf_mat = confusion_matrix(y, y_pred)
print("Confusion matrix:\n", conf_mat)

# Question 4. sklearn's confusion matrix has true labels as rows and
# predicted labels as columns, so ravel() yields tn, fp, fn, tp.
tn, fp, fn, tp = conf_mat.ravel()
tpr = tp / (tp + fn)
tnr = tn / (tn + fp)
fpr = 1 - tnr
print("TPR:", tpr)
print("FPR:", fpr)
# Question 5.
from sklearn.metrics import accuracy_score, classification_report

# Single 80/20 train/test split.
train_x, test_x, train_y, test_y = train_test_split(X, y, test_size=0.2)
depths = [2, 5, 10, 15]
criteria = ['gini', 'entropy']
for depth in depths:
    for criterion in criteria:
        print("Depth: %d\nCriterion: %s" % (depth, criterion))
        clf = DecisionTreeClassifier(criterion=criterion, max_depth=depth)
        clf.fit(train_x, train_y)
        # Accuracy score, as a percentage.
        print("Accuracy Score: %.2f%%" % (accuracy_score(test_y, clf.predict(test_x)) * 100))
        # Per-class precision/recall/F1 report.
        print("Classification Report:")
        print(classification_report(test_y, clf.predict(test_x)))
# Question 6.
from sklearn.ensemble import RandomForestClassifier

clf = RandomForestClassifier()
scores = cross_val_score(clf, X, y, cv=5)
print("Accuracy: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2))

# cross_val_predict and confusion_matrix were already imported above.
y_pred = cross_val_predict(clf, X, y, cv=5)
conf_mat = confusion_matrix(y, y_pred)
print("Confusion matrix:\n", conf_mat)
tn, fp, fn, tp = conf_mat.ravel()
tpr = tp / (tp + fn)
tnr = tn / (tn + fp)
fpr = 1 - tnr
print("TPR:", tpr)
print("FPR:", fpr)
a3-2-1-zgpisn1p-3nmaq3oe/wine.txt
1,14.23,1.71,2.43,15.6,127,2.8,3.06,.28,2.29,5.64,1.04,3.92,1065
1,13.2,1.78,2.14,11.2,100,2.65,2.76,.26,1.28,4.38,1.05,3.4,1050
1,13.16,2.36,2.67,18.6,101,2.8,3.24,.3,2.81,5.68,1.03,3.17,1185
1,14.37,1.95,2.5,16.8,113,3.85,3.49,.24,2.18,7.8,.86,3.45,1480
1,13.24,2.59,2.87,21,118,2.8,2.69,.39,1.82,4.32,1.04,2.93,735
1,14.2,1.76,2.45,15.2,112,3.27,3.39,.34,1.97,6.75,1.05,2.85,1450
1,14.39,1.87,2.45,14.6,96,2.5,2.52,.3,1.98,5.25,1.02,3.58,1290
1,14.06,2.15,2.61,17.6,121,2.6,2.51,.31,1.25,5.05,1.06,3.58,1295
1,14.83,1.64,2.17,14,97,2.8,2.98,.29,1.98,5.2,1.08,2.85,1045
1,13.86,1.35,2.27,16,98,2.98,3.15,.22,1.85,7.22,1.01,3.55,1045
1,14.1,2.16,2.3,18,105,2.95,3.32,.22,2.38,5.75,1.25,3.17,1510
1,14.12,1.48,2.32,16.8,95,2.2,2.43,.26,1.57,5,1.17,2.82,1280
1,13.75,1.73,2.41,16,89,2.6,2.76,.29,1.81,5.6,1.15,2.9,1320
1,14.75,1.73,2.39,11.4,91,3.1,3.69,.43,2.81,5.4,1.25,2.73,1150
1,14.38,1.87,2.38,12,102,3.3,3.64,.29,2.96,7.5,1.2,3,1547
1,13.63,1.81,2.7,17.2,112,2.85,2.91,.3,1.46,7.3,1.28,2.88,1310
1,14.3,1.92,2.72,20,120,2.8,3.14,.33,1.97,6.2,1.07,2.65,1280
1,13.83,1.57,2.62,20,115,2.95,3.4,.4,1.72,6.6,1.13,2.57,1130
1,14.19,1.59,2.48,16.5,108,3.3,3.93,.32,1.86,8.7,1.23,2.82,1680
1,13.64,3.1,2.56,15.2,116,2.7,3.03,.17,1.66,5.1,.96,3.36,845
1,14.06,1.63,2.28,16,126,3,3.17,.24,2.1,5.65,1.09,3.71,780
1,12.93,3.8,2.65,18.6,102,2.41,2.41,.25,1.98,4.5,1.03,3.52,770
1,13.71,1.86,2.36,16.6,101,2.61,2.88,.27,1.69,3.8,1.11,4,1035
1,12.85,1.6,2.52,17.8,95,2.48,2.37,.26,1.46,3.93,1.09,3.63,1015
1,13.5,1.81,2.61,20,96,2.53,2.61,.28,1.66,3.52,1.12,3.82,845
1,13.05,2.05,3.22,25,124,2.63,2.68,.47,1.92,3.58,1.13,3.2,830
1,13.39,1.77,2.62,16.1,93,2.85,2.94,.34,1.45,4.8,.92,3.22,1195
1,13.3,1.72,2.14,17,94,2.4,2.19,.27,1.35,3.95,1.02,2.77,1285
1,13.87,1.9,2.8,19.4,107,2.95,2.97,.37,1.76,4.5,1.25,3.4,915
1,14.02,1.68,2.21,16,96,2.65,2.33,.26,1.98,4.7,1.04,3.59,1035
1,13.73,1.5,2.7,22.5,101,3,3.25,.29,2.38,5.7,1.19,2.71,1285
1,13.58,1.66,2.36,19.1,106,2.86,3.19,.22,1.95,6.9,1.09,2.88,1515
1,13.68,1.83,2.36,17.2,104,2.42,2.69,.42,1.97,3.84,1.23,2.87,990
1,13.76,1.53,2.7,19.5,132,2.95,2.74,.5,1.35,5.4,1.25,3,1235
1,13.51,1.8,2.65,19,110,2.35,2.53,.29,1.54,4.2,1.1,2.87,1095
1,13.48,1.81,2.41,20.5,100,2.7,2.98,.26,1.86,5.1,1.04,3.47,920
1,13.28,1.64,2.84,15.5,110,2.6,2.68,.34,1.36,4.6,1.09,2.78,880
1,13.05,1.65,2.55,18,98,2.45,2.43,.29,1.44,4.25,1.12,2.51,1105
1,13.07,1.5,2.1,15.5,98,2.4,2.64,.28,1.37,3.7,1.18,2.69,1020
1,14.22,3.99,2.51,13.2,128,3,3.04,.2,2.08,5.1,.89,3.53,760
1,13.56,1.71,2.31,16.2,117,3.15,3.29,.34,2.34,6.13,.95,3.38,795
1,13.41,3.84,2.12,18.8,90,2.45,2.68,.27,1.48,4.28,.91,3,1035
1,13.88,1.89,2.59,15,101,3.25,3.56,.17,1.7,5.43,.88,3.56,1095
1,13.24,3.98,2.29,17.5,103,2.64,2.63,.32,1.66,4.36,.82,3,680
1,13.05,1.77,2.1,17,107,3,3,.28,2.03,5.04,.88,3.35,885
1,14.21,4.04,2.44,18.9,111,2.85,2.65,.3,1.25,5.24,.87,3.33,1080
1,14.38,3.59,2.28,16,102,3.25,3.17,.27,2.19,4.9,1.04,3.44,1065
1,13.9,1.68,2.12,16,101,3.1,3.39,.21,2.14,6.1,.91,3.33,985
1,14.1,2.02,2.4,18.8,103,2.75,2.92,.32,2.38,6.2,1.07,2.75,1060
1,13.94,1.73,2.27,17.4,108,2.88,3.54,.32,2.08,8.90,1.12,3.1,1260
1,13.05,1.73,2.04,12.4,92,2.72,3.27,.17,2.91,7.2,1.12,2.91,1150
1,13.83,1.65,2.6,17.2,94,2.45,2.99,.22,2.29,5.6,1.24,3.37,1265
1,13.82,1.75,2.42,14,111,3.88,3.74,.32,1.87,7.05,1.01,3.26,1190
1,13.77,1.9,2.68,17.1,115,3,2.79,.39,1.68,6.3,1.13,2.93,1375
1,13.74,1.67,2.25,16.4,118,2.6,2.9,.21,1.62,5.85,.92,3.2,1060
1,13.56,1.73,2.46,20.5,116,2.96,2.78,.2,2.45,6.25,.98,3.03,1120
1,14.22,1.7,2.3,16.3,118,3.2,3,.26,2.03,6.38,.94,3.31,970
1,13.29,1.97,2.68,16.8,102,3,3.23,.31,1.66,6,1.07,2.84,1270
1,13.72,1.43,2.5,16.7,108,3.4,3.67,.19,2.04,6.8,.89,2.87,1285
2,12.37,.94,1.36,10.6,88,1.98,.57,.28,.42,1.95,1.05,1.82,520
2,12.33,1.1,2.28,16,101,2.05,1.09,.63,.41,3.27,1.25,1.67,680
2,12.64,1.36,2.02,16.8,100,2.02,1.41,.53,.62,5.75,.98,1.59,450
2,13.67,1.25,1.92,18,94,2.1,1.79,.32,.73,3.8,1.23,2.46,630
2,12.37,1.13,2.16,19,87,3.5,3.1,.19,1.87,4.45,1.22,2.87,420
2,12.17,1.45,2.53,19,104,1.89,1.75,.45,1.03,2.95,1.45,2.23,355
2,12.37,1.21,2.56,18.1,98,2.42,2.65,.37,2.08,4.6,1.19,2.3,678
2,13.11,1.01,1.7,15,78,2.98,3.18,.26,2.28,5.3,1.12,3.18,502
2,12.37,1.17,1.92,19.6,78,2.11,2,.27,1.04,4.68,1.12,3.48,510
2,13.34,.94,2.36,17,110,2.53,1.3,.55,.42,3.17,1.02,1.93,750
2,12.21,1.19,1.75,16.8,151,1.85,1.28,.14,2.5,2.85,1.28,3.07,718
2,12.29,1.61,2.21,20.4,103,1.1,1.02,.37,1.46,3.05,.906,1.82,870
2,13.86,1.51,2.67,25,86,2.95,2.86,.21,1.87,3.38,1.36,3.16,410
2,13.49,1.66,2.24,24,87,1.88,1.84,.27,1.03,3.74,.98,2.78,472
2,12.99,1.67,2.6,30,139,3.3,2.89,.21,1.96,3.35,1.31,3.5,985
2,11.96,1.09,2.3,21,101,3.38,2.14,.13,1.65,3.21,.99,3.13,886
2,11.66,1.88,1.92,16,97,1.61,1.57,.34,1.15,3.8,1.23,2.14,428
2,13.03,.9,1.71,16,86,1.95,2.03,.24,1.46,4.6,1.19,2.48,392
2,11.84,2.89,2.23,18,112,1.72,1.32,.43,.95,2.65,.96,2.52,500
2,12.33,.99,1.95,14.8,136,1.9,1.85,.35,2.76,3.4,1.06,2.31,750
2,12.7,3.87,2.4,23,101,2.83,2.55,.43,1.95,2.57,1.19,3.13,463
2,12,.92,2,19,86,2.42,2.26,.3,1.43,2.5,1.38,3.12,278
2,12.72,1.81,2.2,18.8,86,2.2,2.53,.26,1.77,3.9,1.16,3.14,714
2,12.08,1.13,2.51,24,78,2,1.58,.4,1.4,2.2,1.31,2.72,630
2,13.05,3.86,2.32,22.5,85,1.65,1.59,.61,1.62,4.8,.84,2.01,515
2,11.84,.89,2.58,18,94,2.2,2.21,.22,2.35,3.05,.79,3.08,520
2,12.67,.98,2.24,18,99,2.2,1.94,.3,1.46,2.62,1.23,3.16,450
2,12.16,1.61,2.31,22.8,90,1.78,1.69,.43,1.56,2.45,1.33,2.26,495
2,11.65,1.67,2.62,26,88,1.92,1.61,.4,1.34,2.6,1.36,3.21,562
2,11.64,2.06,2.46,21.6,84,1.95,1.69,.48,1.35,2.8,1,2.75,680
2,12.08,1.33,2.3,23.6,70,2.2,1.59,.42,1.38,1.74,1.07,3.21,625
2,12.08,1.83,2.32,18.5,81,1.6,1.5,.52,1.64,2.4,1.08,2.27,480
2,12,1.51,2.42,22,86,1.45,1.25,.5,1.63,3.6,1.05,2.65,450
2,12.69,1.53,2.26,20.7,80,1.38,1.46,.58,1.62,3.05,.96,2.06,495
2,12.29,2.83,2.22,18,88,2.45,2.25,.25,1.99,2.15,1.15,3.3,290
2,11.62,1.99,2.28,18,98,3.02,2.26,.17,1.35,3.25,1.16,2.96,345
2,12.47,1.52,2.2,19,162,2.5,2.27,.32,3.28,2.6,1.16,2.63,937
2,11.81,2.12,2.74,21.5,134,1.6,.99,.14,1.56,2.5,.95,2.26,625
2,12.29,1.41,1.98,16,85,2.55,2.5,.29,1.77,2.9,1.23,2.74,428
2,12.37,1.07,2.1,18.5,88,3.52,3.75,.24,1.95,4.5,1.04,2.77,660
2,12.29,3.17,2.21,18,88,2.85,2.99,.45,2.81,2.3,1.42,2.83,406
2,12.08,2.08,1.7,17.5,97,2.23,2.17,.26,1.4,3.3,1.27,2.96,710
2,12.6,1.34,1.9,18.5,88,1.45,1.36,.29,1.35,2.45,1.04,2.77,562
2,12.34,2.45,2.46,21,98,2.56,2.11,.34,1.31,2.8,.8,3.38,438
2,11.82,1.72,1.88,19.5,86,2.5,1.64,.37,1.42,2.06,.94,2.44,415
2,12.51,1.73,1.98,20.5,85,2.2,1.92,.32,1.48,2.94,1.04,3.57,672
2,12.42,2.55,2.27,22,90,1.68,1.84,.66,1.42,2.7,.86,3.3,315
2,12.25,1.73,2.12,19,80,1.65,2.03,.37,1.63,3.4,1,3.17,510
2,12.72,1.75,2.28,22.5,84,1.38,1.76,.48,1.63,3.3,.88,2.42,488
2,12.22,1.29,1.94,19,92,2.36,2.04,.39,2.08,2.7,.86,3.02,312
2,11.61,1.35,2.7,20,94,2.74,2.92,.29,2.49,2.65,.96,3.26,680
2,11.46,3.74,1.82,19.5,107,3.18,2.58,.24,3.58,2.9,.75,2.81,562
2,12.52,2.43,2.17,21,88,2.55,2.27,.26,1.22,2,.9,2.78,325
2,11.76,2.68,2.92,20,103,1.75,2.03,.6,1.05,3.8,1.23,2.5,607
2,11.41,.74,2.5,21,88,2.48,2.01,.42,1.44,3.08,1.1,2.31,434
2,12.08,1.39,2.5,22.5,84,2.56,2.29,.43,1.04,2.9,.93,3.19,385
2,11.03,1.51,2.2,21.5,85,2.46,2.17,.52,2.01,1.9,1.71,2.87,407
2,11.82,1.47,1.99,20.8,86,1.98,1.6,.3,1.53,1.95,.95,3.33,495
2,12.42,1.61,2.19,22.5,108,2,2.09,.34,1.61,2.06,1.06,2.96,345
2,12.77,3.43,1.98,16,80,1.63,1.25,.43,.83,3.4,.7,2.12,372
2,12,3.43,2,19,87,2,1.64,.37,1.87,1.28,.93,3.05,564
2,11.45,2.4,2.42,20,96,2.9,2.79,.32,1.83,3.25,.8,3.39,625
2,11.56,2.05,3.23,28.5,119,3.18,5.08,.47,1.87,6,.93,3.69,465
2,12.42,4.43,2.73,26.5,102,2.2,2.13,.43,1.71,2.08,.92,3.12,365
2,13.05,5.8,2.13,21.5,86,2.62,2.65,.3,2.01,2.6,.73,3.1,380
2,11.87,4.31,2.39,21,82,2.86,3.03,.21,2.91,2.8,.75,3.64,380
2,12.07,2.16,2.17,21,85,2.6,2.65,.37,1.35,2.76,.86,3.28,378
2,12.43,1.53,2.29,21.5,86,2.74,3.15,.39,1.77,3.94,.69,2.84,352
2,11.79,2.13,2.78,28.5,92,2.13,2.24,.58,1.76,3,.97,2.44,466
2,12.37,1.63,2.3,24.5,88,2.22,2.45,.4,1.9,2.12,.89,2.78,342
2,12.04,4.3,2.38,22,80,2.1,1.75,.42,1.35,2.6,.79,2.57,580
3,12.86,1.35,2.32,18,122,1.51,1.25,.21,.94,4.1,.76,1.29,630
3,12.88,2.99,2.4,20,104,1.3,1.22,.24,.83,5.4,.74,1.42,530
3,12.81,2.31,2.4,24,98,1.15,1.09,.27,.83,5.7,.66,1.36,560
3,12.7,3.55,2.36,21.5,106,1.7,1.2,.17,.84,5,.78,1.29,600
3,12.51,1.24,2.25,17.5,85,2,.58,.6,1.25,5.45,.75,1.51,650
3,12.6,2.46,2.2,18.5,94,1.62,.66,.63,.94,7.1,.73,1.58,695
3,12.25,4.72,2.54,21,89,1.38,.47,.53,.8,3.85,.75,1.27,720
3,12.53,5.51,2.64,25,96,1.79,.6,.63,1.1,5,.82,1.69,515
3,13.49,3.59,2.19,19.5,88,1.62,.48,.58,.88,5.7,.81,1.82,580
3,12.84,2.96,2.61,24,101,2.32,.6,.53,.81,4.92,.89,2.15,590
3,12.93,2.81,2.7,21,96,1.54,.5,.53,.75,4.6,.77,2.31,600
3,13.36,2.56,2.35,20,89,1.4,.5,.37,.64,5.6,.7,2.47,780
3,13.52,3.17,2.72,23.5,97,1.55,.52,.5,.55,4.35,.89,2.06,520
3,13.62,4.95,2.35,20,92,2,.8,.47,1.02,4.4,.91,2.05,550
3,12.25,3.88,2.2,18.5,112,1.38,.78,.29,1.14,8.21,.65,2,855
3,13.16,3.57,2.15,21,102,1.5,.55,.43,1.3,4,.6,1.68,830
3,13.88,5.04,2.23,20,80,.98,.34,.4,.68,4.9,.58,1.33,415
3,12.87,4.61,2.48,21.5,86,1.7,.65,.47,.86,7.65,.54,1.86,625
3,13.32,3.24,2.38,21.5,92,1.93,.76,.45,1.25,8.42,.55,1.62,650
3,13.08,3.9,2.36,21.5,113,1.41,1.39,.34,1.14,9.40,.57,1.33,550
3,13.5,3.12,2.62,24,123,1.4,1.57,.22,1.25,8.60,.59,1.3,500
3,12.79,2.67,2.48,22,112,1.48,1.36,.24,1.26,10.8,.48,1.47,480
3,13.11,1.9,2.75,25.5,116,2.2,1.28,.26,1.56,7.1,.61,1.33,425
3,13.23,3.3,2.28,18.5,98,1.8,.83,.61,1.87,10.52,.56,1.51,675
3,12.58,1.29,2.1,20,103,1.48,.58,.53,1.4,7.6,.58,1.55,640
3,13.17,5.19,2.32,22,93,1.74,.63,.61,1.55,7.9,.6,1.48,725
3,13.84,4.12,2.38,19.5,89,1.8,.83,.48,1.56,9.01,.57,1.64,480
3,12.45,3.03,2.64,27,97,1.9,.58,.63,1.14,7.5,.67,1.73,880
3,14.34,1.68,2.7,25,98,2.8,1.31,.53,2.7,13,.57,1.96,660
3,13.48,1.67,2.64,22.5,89,2.6,1.1,.52,2.29,11.75,.57,1.78,620
3,12.36,3.83,2.38,21,88,2.3,.92,.5,1.04,7.65,.56,1.58,520
3,13.69,3.26,2.54,20,107,1.83,.56,.5,.8,5.88,.96,1.82,680
3,12.85,3.27,2.58,22,106,1.65,.6,.6,.96,5.58,.87,2.11,570
3,12.96,3.45,2.35,18.5,106,1.39,.7,.4,.94,5.28,.68,1.75,675
3,13.78,2.76,2.3,22,90,1.35,.68,.41,1.03,9.58,.7,1.68,615
3,13.73,4.36,2.26,22.5,88,1.28,.47,.52,1.15,6.62,.78,1.75,520
3,13.45,3.7,2.6,23,111,1.7,.92,.43,1.46,10.68,.85,1.56,695
3,12.82,3.37,2.3,19.5,88,1.48,.66,.4,.97,10.26,.72,1.75,685
3,13.58,2.58,2.69,24.5,105,1.55,.84,.39,1.54,8.66,.74,1.8,750
3,13.4,4.6,2.86,25,112,1.98,.96,.27,1.11,8.5,.67,1.92,630
3,12.2,3.03,2.32,19,96,1.25,.49,.4,.73,5.5,.66,1.83,510
3,12.77,2.39,2.28,19.5,86,1.39,.51,.48,.64,9.899999,.57,1.63,470
3,14.16,2.51,2.48,20,91,1.68,.7,.44,1.24,9.7,.62,1.71,660
3,13.71,5.65,2.45,20.5,95,1.68,.61,.52,1.06,7.7,.64,1.74,740
3,13.4,3.91,2.48,23,102,1.8,.75,.43,1.41,7.3,.7,1.56,750
3,13.27,4.28,2.26,20,120,1.59,.69,.43,1.35,10.2,.59,1.56,835
3,13.17,2.59,2.37,20,120,1.65,.68,.53,1.46,9.3,.6,1.62,840
3,14.13,4.1,2.74,24.5,96,2.05,.76,.56,1.35,9.2,.61,1.6,560
a3-2-1-zgpisn1p-3nmaq3oe/bagging.py
'''Bagging is a general 'ensemble method' in machine learning. To avoid
overfitting the training data, we construct several classifiers using
different distributions of training examples.'''
from classifier import classifier
from decision_tree import greedy_decision_tree
from random import choice


def sample_with_replacement(lst):
    '''Return a resampled data set based on 'lst'.'''
    return [choice(lst) for _ in range(len(lst))]


class bagging_trees(classifier):
    '''Implement a simple version of bagging trees, in a random forest
    classifier.'''
    def __init__(self, M=21):
        '''Initialize the empty forest. M should be odd, so that the
        majority vote in predict() can never tie.'''
        self.forest = []
        self.M = M

    def predict(self, data_point):
        '''Predict the class of the 'data_point' by majority vote
        (assumes binary 0/1 labels).'''
        c = sum(dt.predict(data_point) > 0 for dt in self.forest)
        return c > (self.M - c)

    def train(self, training_data):
        '''Train a forest using the bagging trees algorithm.'''
        for i in range(self.M):
            dt = greedy_decision_tree()
            dt.train(sample_with_replacement(training_data))
            self.forest.append(dt)


if __name__ == "__main__":
    # Basic testing code.
    from datasets import read_parkinsons_dataset
    from classifier import evaluate
    from random import shuffle

    dataset = read_parkinsons_dataset()
    shuffle(dataset)
    pct = evaluate(dataset, bagging_trees, 4, M=49)
    print('Bagging: {:.2%}'.format(pct))
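    # Added sketch (not in the handout): why resampling with replacement
    # diversifies the forest. On average only about 63% of the original
    # items (1 - 1/e) appear in any one bootstrap sample, so each tree
    # sees a noticeably different training set.
    sample = sample_with_replacement(dataset)
    unique = len({id(item) for item in sample})
    print('Unique items in one bootstrap sample: {:.1%}'.format(unique / len(dataset)))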
a3-2-1-zgpisn1p-3nmaq3oe/extra_trees.py
from classifier import classifier
from decision_tree import decision_tree, dt_node, score_test, split
from bagging import bagging_trees
from random import sample, uniform


def majority_label(dataset):
    '''Return the most common label in a dataset.'''
    counts = {}
    for item in dataset:
        counts[item.label] = counts.get(item.label, 0) + 1
    return max(counts.keys(), key=lambda x: counts[x])


class extra_tree(decision_tree):
    '''Build a tree using the 'extremely randomized trees' algorithm.'''
    def train(self, training_data, K=10, Nmin=2):
        '''Train an individual extra tree.'''

        def same_labels(node_data):
            '''Return True if all labels in the training data
            are the same.'''
            label = node_data[0].label
            return all(label == x.label for x in node_data[1:])

        def non_constant_features(node_data):
            '''Return a list of features that are not constant in the
            training data.'''
            indices = []
            m = len(node_data[0].data)
            # Check each feature.
            for i in range(m):
                value = node_data[0].data[i]
                # Compare all feature values to the first value.
                if any(value != x.data[i] for x in node_data[1:]):
                    # Not constant, add it to the list.
                    indices.append(i)
            return indices

        def pick_random_split(node_data, indices):
            '''Generate one random split per candidate feature index and
            return the best-scoring one.'''
            max_score = -float('inf')
            max_index = 0
            max_value = 0
            for index in indices:
                feature = [item.data[index] for item in node_data]
                value = uniform(min(feature), max(feature))
                score = score_test(node_data, index, value)
                if score > max_score:
                    max_score = score
                    max_index = index
                    max_value = value
            return (max_index, max_value)

        def build_tree(node_data, K, Nmin):
            '''Recursively build a tree using the extra tree algorithm.'''
            node = dt_node()
            n = len(node_data)
            indices = non_constant_features(node_data)
            if n < Nmin or len(indices) == 0 or same_labels(node_data):
                node.label = majority_label(node_data)
            else:
                if len(indices) > K:
                    indices = sample(indices, K)
                node.index, node.value = pick_random_split(node_data, indices)
                left_data, right_data = split(node_data, node.index, node.value)
                node.left = build_tree(left_data, K, Nmin)
                node.right = build_tree(right_data, K, Nmin)
            return node

        self.root = build_tree(training_data, K, Nmin)


class extra_trees(bagging_trees):
    '''Implement "Extremely Randomized Trees", the random forest classifier
    described in Geurts et al. 2006.'''
    def __init__(self, M=15, K=10, Nmin=2):
        '''Initialize the empty forest for an
        Extremely Randomized ("extra") tree classifier.'''
        super().__init__(M)
        self.K = K
        self.Nmin = Nmin

    def train(self, training_data):
        '''Train a random forest using the 'extremely randomized'
        tree algorithm.'''
        for i in range(self.M):
            dt = extra_tree()
            dt.train(training_data, self.K, self.Nmin)
            self.forest.append(dt)


if __name__ == "__main__":
    from datasets import read_parkinsons_dataset
    from classifier import evaluate
    from random import shuffle

    dataset = read_parkinsons_dataset()
    shuffle(dataset)
    m = len(dataset[0].data)  # number of features
    pct = evaluate(dataset, extra_trees, 4, M=99, K=int(m ** 0.5))
    print('Extra Trees: {:.2%}'.format(pct))
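    # Added sketch (not in the handout): K controls how many candidate
    # features each split considers, so a small sweep shows its effect.
    # M is reduced here only to keep the run time short.
    for K in (1, int(m ** 0.5), m):
        pct = evaluate(dataset, extra_trees, 4, M=25, K=K)
        print('K = {:2d}: {:.2%}'.format(K, pct))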
a3-2-1-zgpisn1p-3nmaq3oe/datasets.py
# Simple code for loading some standard data sets from the UCI Machine
# Learning repository.
#
from classifier import data_item


def read_wine_dataset():
    '''Return a list of data_item objects representing the UCI Wine data.'''
    dataset = []
    fp = open('wine.txt')
    for line in fp:
        fields = line.split(',')
        data = [float(v) for v in fields[1:]]
        label = int(fields[0]) - 1
        dataset.append(data_item(label, data))
    fp.close()
    return dataset


def read_iris_dataset():
    '''Return a list of data_item objects representing the UCI Iris data.'''
    dataset = []
    fp = open('iris.txt')
    for line in fp:
        if not line.startswith('#'):
            fields = line.split()
            data = [float(v) for v in fields[:-1]]
            if fields[-1] == "Iris-setosa":
                label = 0
            elif fields[-1] == "Iris-versicolor":
                label = 1
            elif fields[-1] == "Iris-virginica":
                label = 2
            else:
                raise ValueError("Illegal class name: " + fields[-1])
            dataset.append(data_item(label, data))
    fp.close()
    return dataset


def read_seeds_dataset():
    '''Return a list of data_item objects representing the UCI Seeds data.'''
    fp = open('seeds.txt')
    dataset = []
    for line in fp:
        fields = line.split()
        data = [float(v) for v in fields[:-1]]
        label = int(fields[-1]) - 1
        dataset.append(data_item(label, data))
    fp.close()
    return dataset


def read_parkinsons_dataset():
    '''Return a list of data_item objects representing the UCI Parkinson's
    data. Field 0 is the recording name and field 17 is the 'status'
    label; the remaining fields are the numeric features.'''
    fp = open('parkinsons.data')
    dataset = []
    header = fp.readline()  # skip the column-name header.
    for line in fp:
        fields = line.split(',')
        label = int(fields[17])
        data = [float(x) for x in (fields[1:17] + fields[18:])]
        dataset.append(data_item(label, data))
    fp.close()
    return dataset


def read_datasets():
    '''Return all four of the datasets we use in a single dictionary.'''
    datasets = {}
    datasets['Wine'] = read_wine_dataset()
    datasets['Iris'] = read_iris_dataset()
    datasets['Seeds'] = read_seeds_dataset()
    datasets['Parkinsons'] = read_parkinsons_dataset()
    return datasets
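if __name__ == "__main__":
    # Small smoke test (an added sketch, not part of the original file):
    # load everything and report sizes. Assumes the four data files sit
    # in the working directory.
    for name, data in read_datasets().items():
        print('{}: {} items, {} features'.format(name, len(data), len(data[0].data)))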