ValueError while fitting Decision Tree Classifier on a dataset
I have created features X and labels y for the dataset I am working on.
At this point, I want to train a classifier on it (a DecisionTreeClassifier in the code below), but fitting it on the training data raises a ValueError: setting an array element with a sequence.
Below are the X features, the y labels, and the error details:
X:
(array([-8.1530527e-10, 8.9952795e-10, -9.1185753e-10, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-0.00050612, -0.00057967, -0.00035985, ..., 0. ,
0. , 0. ], dtype=float32),
array([ 6.8139506e-08, -2.3837963e-05, -2.4622474e-05, ...,
3.1678758e-06, -2.4535689e-06, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
6.9306935e-07, -6.6020442e-07, 0.0000000e+00], dtype=float32),
array([-7.30260945e-05, -1.18022966e-04, -1.08280736e-04, ...,
8.83421380e-05, 4.97258679e-06, 0.00000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 2.3406714e-05, 3.1186773e-05, 4.9467826e-06, ...,
1.2180173e-07, -9.2944845e-08, 0.0000000e+00], dtype=float32),
array([ 1.1845550e-06, -1.6399191e-06, 2.5565218e-06, ...,
-8.7445065e-09, 5.9859917e-09, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-1.3284328e-05, -7.4090644e-07, 7.2679302e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
5.0694009e-08, -3.4546797e-08, 0.0000000e+00], dtype=float32),
array([ 1.5591205e-07, -1.5845627e-07, 1.5362870e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 1.1608539e-05,
8.2463991e-09, 0.0000000e+00], dtype=float32),
array([-3.6192148e-07, -1.4590451e-05, -5.3999561e-06, ...,
-1.9935460e-05, -3.4417746e-05, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.5319534e-07, 2.6521766e-07, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.5055220e-08, 1.2936166e-08, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 1.3387315e-05, 6.0913658e-07, -5.6471418e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 1.7200684e-02, 3.2272514e-02, 3.2961801e-02, ...,
-1.6286784e-06, -8.5592075e-07, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-3.3923173e-11, 2.8026699e-11, 0.0000000e+00], dtype=float32),
array([-0.00103188, -0.00075814, -0.00051426, ..., 0. ,
0. , 0. ], dtype=float32),
array([ 7.6278877e-07, 2.1624428e-05, 1.1150542e-05, ...,
1.8263392e-09, -1.5558380e-09, 0.0000000e+00], dtype=float32),
array([-1.2111740e-07, 6.3130176e-07, -1.8378003e-06, ...,
1.1309878e-05, 5.4562256e-06, 0.0000000e+00], dtype=float32),
array([0.00026949, 0.00028119, 0.00020081, ..., 0.00032586, 0.00046612,
0. ], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-7.8796054e-09, 1.7431153e-08, 0.0000000e+00], dtype=float32),
array([1.42000988e-06, 1.30781755e-05, 2.77493709e-05, ...,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00], dtype=float32),
array([ 2.9161662e-10, -6.3629275e-11, -3.0565092e-10, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 2.2051008e-05, 1.6838792e-05, 3.5639907e-05, ...,
4.5767497e-06, -1.2002213e-05, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.0104826e-10, 1.6824393e-10, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-4.8303300e-06, -1.2008861e-05, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.7673337e-07, 2.8604177e-07, 0.0000000e+00], dtype=float32),
array([-0.00066044, -0.0009837 , -0.00090796, ..., -0.00171516,
-0.0017666 , 0. ], dtype=float32),
array([ 3.2218946e-11, -5.5296181e-11, 8.9530647e-11, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-1.3284328e-05, -7.4090644e-07, 7.2679302e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 4.9886359e-05, 1.4642075e-04, 4.4365996e-04, ...,
6.3584002e-07, -6.2395281e-07, 0.0000000e+00], dtype=float32),
array([-3.2826196e-04, 4.5522624e-03, -8.2306744e-04, ...,
-2.2519816e-07, -6.2417300e-08, 0.0000000e+00], dtype=float32),
array([ 3.1686827e-04, 4.6282235e-04, 1.0160641e-04, ...,
-1.4605960e-05, 6.6572487e-05, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-7.1763244e-09, -2.8297892e-08, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-2.5870585e-07, 4.6514080e-07, -9.5607948e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 5.788035e-07, -6.493598e-07, 7.111379e-07, ..., 0.000000e+00,
0.000000e+00, 0.000000e+00], dtype=float32),
array([ 2.5118000e-04, 1.4220485e-03, 3.9536849e-04, ...,
4.5242754e-04, -3.1405249e-05, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 1.1985266e-07, 2.1360799e-07, -1.1951373e-06, ...,
-1.3043609e-04, 1.2107374e-06, 0.0000000e+00], dtype=float32),
array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 2.5944988e-08,
1.2123945e-07, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-2.4280996e-06, -1.2362683e-05, -8.5034850e-07, ...,
-1.0113516e-11, 5.1403621e-12, 0.0000000e+00], dtype=float32),
array([9.6098862e-05, 1.6449913e-04, 1.1942573e-04, ..., 0.0000000e+00,
0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 1.3284328e-05, 7.4090644e-07, -7.2679302e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 2.4700081e-05, 2.9454704e-05, 8.0751715e-06, ...,
1.2746801e-07, -1.6574201e-06, 0.0000000e+00], dtype=float32),
array([8.4619669e-06, 9.7476968e-06, 2.0182479e-05, ..., 2.1081217e-11,
4.0220186e-10, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32))
y:
('08',
'08',
'06',
'05',
'05',
'04',
'06',
'07',
'01',
'04',
'03',
'07',
'03',
'01',
'03',
'03',
'02',
'02',
'02',
'02',
'05',
'06',
'04',
'08',
'07',
'06',
'04',
'05',
'07',
'02',
'08',
'01',
'08',
'03',
'08',
'02',
'03',
'06',
'04',
'07',
'04',
'07',
'05',
'06',
'08',
'08',
'04',
'05',
'05',
'04',
'06',
'07',
'05',
'07',
'01',
'06',
'02',
'02',
'03',
'03')
Code for the classifier plus the train/test split:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
from sklearn.tree import DecisionTreeClassifier
dtree = DecisionTreeClassifier()
dtree.fit(X_train, y_train)
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-70-b6417fbfb8de> in <module>()
1 from sklearn.tree import DecisionTreeClassifier
2 dtree = DecisionTreeClassifier()
----> 3 dtree.fit(X_train, y_train)
/usr/local/lib/python3.6/dist-packages/sklearn/tree/tree.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
788 sample_weight=sample_weight,
789 check_input=check_input,
--> 790 X_idx_sorted=X_idx_sorted)
791 return self
792
/usr/local/lib/python3.6/dist-packages/sklearn/tree/tree.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
114 random_state = check_random_state(self.random_state)
115 if check_input:
--> 116 X = check_array(X, dtype=DTYPE, accept_sparse="csc")
117 y = check_array(y, ensure_2d=False, dtype=None)
118 if issparse(X):
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
431 force_all_finite)
432 else:
--> 433 array = np.array(array, dtype=dtype, order=order, copy=copy)
434
435 if ensure_2d:
ValueError: setting an array element with a sequence.
EDIT1: I converted both X and y into NumPy arrays, but I still get the same error. Details below:
import numpy as np
X = np.asarray(X)
y = np.asarray(y)
X.shape, y.shape
Output:
((60,), (60,))
python machine-learning scikit-learn random-forest
check out this answer: stackoverflow.com/questions/36115472/…
– Tyson
Nov 17 '18 at 13:55
There is something wrong with your X or y. First try the following and report the result:
import numpy as np
X = np.array(X)
print(X.shape)
y = np.array(y)
print(y.shape)
– Luca Massaron
Nov 17 '18 at 16:01
I tried exactly that, and this is the outcome after converting both X and y to NumPy arrays: X.shape, y.shape -> ((60,), (60,)).
– Marco G. de Pinto
Nov 17 '18 at 16:03
The problem is X. Now just try: np.array(X).dtype
– Luca Massaron
Nov 17 '18 at 16:07
Your X is a sequence of strings, that's the problem. You have to check it carefully, because either there is a string in it or some of the arrays you put in it have a different length than the others. I will post an answer for you.
– Luca Massaron
Nov 17 '18 at 16:10
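The unequal-length case raised in the last comment can be checked directly. The following is only a minimal sketch under stated assumptions: `X_ragged` is small made-up data standing in for the real X, `pad_to_matrix` is a hypothetical helper (not part of NumPy or scikit-learn), and zero-padding is just one possible fix, suggested by the trailing zeros in the printed arrays. Whether padding is meaningful depends on what the features represent.

```python
import numpy as np

# Hypothetical stand-in for X: float32 arrays of unequal length.
rng = np.random.default_rng(0)
X_ragged = tuple(rng.standard_normal(n).astype(np.float32) for n in (5, 3, 4))

# Step 1: inspect the per-sample lengths. More than one distinct value
# means the rows cannot be stacked into a 2-D (n_samples, n_features) array.
lengths = [len(row) for row in X_ragged]
print(sorted(set(lengths)))


def pad_to_matrix(rows, fill_value=0.0):
    """Right-pad 1-D arrays with fill_value so they stack into a 2-D matrix.

    Hypothetical helper for illustration only.
    """
    width = max(len(r) for r in rows)
    out = np.full((len(rows), width), fill_value, dtype=np.float32)
    for i, r in enumerate(rows):
        out[i, :len(r)] = r
    return out


# Step 2: build a rectangular float32 matrix that an estimator can accept.
X_fixed = pad_to_matrix(X_ragged)
print(X_fixed.shape, X_fixed.dtype)
```

If the lengths all turn out to be equal, the cause is more likely a non-numeric element somewhere in X, and padding would not help.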
|
show 1 more comment
I have created features X and labels y for the dataset I am working on.
At this point, I want to train a random forest classifier on it but I am facing a ValueError while fitting the classifier on the training data: setting an array element with a sequence.
Below the X and y features and the error details:
X:
(array([-8.1530527e-10, 8.9952795e-10, -9.1185753e-10, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-0.00050612, -0.00057967, -0.00035985, ..., 0. ,
0. , 0. ], dtype=float32),
array([ 6.8139506e-08, -2.3837963e-05, -2.4622474e-05, ...,
3.1678758e-06, -2.4535689e-06, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
6.9306935e-07, -6.6020442e-07, 0.0000000e+00], dtype=float32),
array([-7.30260945e-05, -1.18022966e-04, -1.08280736e-04, ...,
8.83421380e-05, 4.97258679e-06, 0.00000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 2.3406714e-05, 3.1186773e-05, 4.9467826e-06, ...,
1.2180173e-07, -9.2944845e-08, 0.0000000e+00], dtype=float32),
array([ 1.1845550e-06, -1.6399191e-06, 2.5565218e-06, ...,
-8.7445065e-09, 5.9859917e-09, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-1.3284328e-05, -7.4090644e-07, 7.2679302e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
5.0694009e-08, -3.4546797e-08, 0.0000000e+00], dtype=float32),
array([ 1.5591205e-07, -1.5845627e-07, 1.5362870e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 1.1608539e-05,
8.2463991e-09, 0.0000000e+00], dtype=float32),
array([-3.6192148e-07, -1.4590451e-05, -5.3999561e-06, ...,
-1.9935460e-05, -3.4417746e-05, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.5319534e-07, 2.6521766e-07, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.5055220e-08, 1.2936166e-08, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 1.3387315e-05, 6.0913658e-07, -5.6471418e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 1.7200684e-02, 3.2272514e-02, 3.2961801e-02, ...,
-1.6286784e-06, -8.5592075e-07, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-3.3923173e-11, 2.8026699e-11, 0.0000000e+00], dtype=float32),
array([-0.00103188, -0.00075814, -0.00051426, ..., 0. ,
0. , 0. ], dtype=float32),
array([ 7.6278877e-07, 2.1624428e-05, 1.1150542e-05, ...,
1.8263392e-09, -1.5558380e-09, 0.0000000e+00], dtype=float32),
array([-1.2111740e-07, 6.3130176e-07, -1.8378003e-06, ...,
1.1309878e-05, 5.4562256e-06, 0.0000000e+00], dtype=float32),
array([0.00026949, 0.00028119, 0.00020081, ..., 0.00032586, 0.00046612,
0. ], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-7.8796054e-09, 1.7431153e-08, 0.0000000e+00], dtype=float32),
array([1.42000988e-06, 1.30781755e-05, 2.77493709e-05, ...,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00], dtype=float32),
array([ 2.9161662e-10, -6.3629275e-11, -3.0565092e-10, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 2.2051008e-05, 1.6838792e-05, 3.5639907e-05, ...,
4.5767497e-06, -1.2002213e-05, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.0104826e-10, 1.6824393e-10, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-4.8303300e-06, -1.2008861e-05, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.7673337e-07, 2.8604177e-07, 0.0000000e+00], dtype=float32),
array([-0.00066044, -0.0009837 , -0.00090796, ..., -0.00171516,
-0.0017666 , 0. ], dtype=float32),
array([ 3.2218946e-11, -5.5296181e-11, 8.9530647e-11, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-1.3284328e-05, -7.4090644e-07, 7.2679302e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 4.9886359e-05, 1.4642075e-04, 4.4365996e-04, ...,
6.3584002e-07, -6.2395281e-07, 0.0000000e+00], dtype=float32),
array([-3.2826196e-04, 4.5522624e-03, -8.2306744e-04, ...,
-2.2519816e-07, -6.2417300e-08, 0.0000000e+00], dtype=float32),
array([ 3.1686827e-04, 4.6282235e-04, 1.0160641e-04, ...,
-1.4605960e-05, 6.6572487e-05, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-7.1763244e-09, -2.8297892e-08, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-2.5870585e-07, 4.6514080e-07, -9.5607948e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 5.788035e-07, -6.493598e-07, 7.111379e-07, ..., 0.000000e+00,
0.000000e+00, 0.000000e+00], dtype=float32),
array([ 2.5118000e-04, 1.4220485e-03, 3.9536849e-04, ...,
4.5242754e-04, -3.1405249e-05, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 1.1985266e-07, 2.1360799e-07, -1.1951373e-06, ...,
-1.3043609e-04, 1.2107374e-06, 0.0000000e+00], dtype=float32),
array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 2.5944988e-08,
1.2123945e-07, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-2.4280996e-06, -1.2362683e-05, -8.5034850e-07, ...,
-1.0113516e-11, 5.1403621e-12, 0.0000000e+00], dtype=float32),
array([9.6098862e-05, 1.6449913e-04, 1.1942573e-04, ..., 0.0000000e+00,
0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 1.3284328e-05, 7.4090644e-07, -7.2679302e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 2.4700081e-05, 2.9454704e-05, 8.0751715e-06, ...,
1.2746801e-07, -1.6574201e-06, 0.0000000e+00], dtype=float32),
array([8.4619669e-06, 9.7476968e-06, 2.0182479e-05, ..., 2.1081217e-11,
4.0220186e-10, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32))
y below
('08',
'08',
'06',
'05',
'05',
'04',
'06',
'07',
'01',
'04',
'03',
'07',
'03',
'01',
'03',
'03',
'02',
'02',
'02',
'02',
'05',
'06',
'04',
'08',
'07',
'06',
'04',
'05',
'07',
'02',
'08',
'01',
'08',
'03',
'08',
'02',
'03',
'06',
'04',
'07',
'04',
'07',
'05',
'06',
'08',
'08',
'04',
'05',
'05',
'04',
'06',
'07',
'05',
'07',
'01',
'06',
'02',
'02',
'03',
'03')
Code for the classifier plus the train/test split:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
from sklearn.tree import DecisionTreeClassifier
dtree = DecisionTreeClassifier()
dtree.fit(X_train, y_train)
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-70-b6417fbfb8de> in <module>()
1 from sklearn.tree import DecisionTreeClassifier
2 dtree = DecisionTreeClassifier()
----> 3 dtree.fit(X_train, y_train)
/usr/local/lib/python3.6/dist-packages/sklearn/tree/tree.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
788 sample_weight=sample_weight,
789 check_input=check_input,
--> 790 X_idx_sorted=X_idx_sorted)
791 return self
792
/usr/local/lib/python3.6/dist-packages/sklearn/tree/tree.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
114 random_state = check_random_state(self.random_state)
115 if check_input:
--> 116 X = check_array(X, dtype=DTYPE, accept_sparse="csc")
117 y = check_array(y, ensure_2d=False, dtype=None)
118 if issparse(X):
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
431 force_all_finite)
432 else:
--> 433 array = np.array(array, dtype=dtype, order=order, copy=copy)
434
435 if ensure_2d:
ValueError: setting an array element with a sequence.
EDIT1: I converted both X and y into numpy arrays but the error I am receiving is the same, details below
import numpy as np
X = np.asarray(X)
y = np.asarray(y)
X.shape, y.shape
Output:
((60,), (60,))
python machine-learning scikit-learn random-forest
I have created features X and labels y for the dataset I am working on.
At this point, I want to train a random forest classifier on it but I am facing a ValueError while fitting the classifier on the training data: setting an array element with a sequence.
Below the X and y features and the error details:
X:
(array([-8.1530527e-10, 8.9952795e-10, -9.1185753e-10, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-0.00050612, -0.00057967, -0.00035985, ..., 0. ,
0. , 0. ], dtype=float32),
array([ 6.8139506e-08, -2.3837963e-05, -2.4622474e-05, ...,
3.1678758e-06, -2.4535689e-06, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
6.9306935e-07, -6.6020442e-07, 0.0000000e+00], dtype=float32),
array([-7.30260945e-05, -1.18022966e-04, -1.08280736e-04, ...,
8.83421380e-05, 4.97258679e-06, 0.00000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 2.3406714e-05, 3.1186773e-05, 4.9467826e-06, ...,
1.2180173e-07, -9.2944845e-08, 0.0000000e+00], dtype=float32),
array([ 1.1845550e-06, -1.6399191e-06, 2.5565218e-06, ...,
-8.7445065e-09, 5.9859917e-09, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-1.3284328e-05, -7.4090644e-07, 7.2679302e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
5.0694009e-08, -3.4546797e-08, 0.0000000e+00], dtype=float32),
array([ 1.5591205e-07, -1.5845627e-07, 1.5362870e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 1.1608539e-05,
8.2463991e-09, 0.0000000e+00], dtype=float32),
array([-3.6192148e-07, -1.4590451e-05, -5.3999561e-06, ...,
-1.9935460e-05, -3.4417746e-05, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.5319534e-07, 2.6521766e-07, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.5055220e-08, 1.2936166e-08, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 1.3387315e-05, 6.0913658e-07, -5.6471418e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 1.7200684e-02, 3.2272514e-02, 3.2961801e-02, ...,
-1.6286784e-06, -8.5592075e-07, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-3.3923173e-11, 2.8026699e-11, 0.0000000e+00], dtype=float32),
array([-0.00103188, -0.00075814, -0.00051426, ..., 0. ,
0. , 0. ], dtype=float32),
array([ 7.6278877e-07, 2.1624428e-05, 1.1150542e-05, ...,
1.8263392e-09, -1.5558380e-09, 0.0000000e+00], dtype=float32),
array([-1.2111740e-07, 6.3130176e-07, -1.8378003e-06, ...,
1.1309878e-05, 5.4562256e-06, 0.0000000e+00], dtype=float32),
array([0.00026949, 0.00028119, 0.00020081, ..., 0.00032586, 0.00046612,
0. ], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-7.8796054e-09, 1.7431153e-08, 0.0000000e+00], dtype=float32),
array([1.42000988e-06, 1.30781755e-05, 2.77493709e-05, ...,
0.00000000e+00, 0.00000000e+00, 0.00000000e+00], dtype=float32),
array([ 2.9161662e-10, -6.3629275e-11, -3.0565092e-10, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 2.2051008e-05, 1.6838792e-05, 3.5639907e-05, ...,
4.5767497e-06, -1.2002213e-05, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.0104826e-10, 1.6824393e-10, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-4.8303300e-06, -1.2008861e-05, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-2.7673337e-07, 2.8604177e-07, 0.0000000e+00], dtype=float32),
array([-0.00066044, -0.0009837 , -0.00090796, ..., -0.00171516,
-0.0017666 , 0. ], dtype=float32),
array([ 3.2218946e-11, -5.5296181e-11, 8.9530647e-11, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-1.3284328e-05, -7.4090644e-07, 7.2679302e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 4.9886359e-05, 1.4642075e-04, 4.4365996e-04, ...,
6.3584002e-07, -6.2395281e-07, 0.0000000e+00], dtype=float32),
array([-3.2826196e-04, 4.5522624e-03, -8.2306744e-04, ...,
-2.2519816e-07, -6.2417300e-08, 0.0000000e+00], dtype=float32),
array([ 3.1686827e-04, 4.6282235e-04, 1.0160641e-04, ...,
-1.4605960e-05, 6.6572487e-05, 0.0000000e+00], dtype=float32),
array([ 0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ...,
-7.1763244e-09, -2.8297892e-08, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-2.5870585e-07, 4.6514080e-07, -9.5607948e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 5.788035e-07, -6.493598e-07, 7.111379e-07, ..., 0.000000e+00,
0.000000e+00, 0.000000e+00], dtype=float32),
array([ 2.5118000e-04, 1.4220485e-03, 3.9536849e-04, ...,
4.5242754e-04, -3.1405249e-05, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([ 1.1985266e-07, 2.1360799e-07, -1.1951373e-06, ...,
-1.3043609e-04, 1.2107374e-06, 0.0000000e+00], dtype=float32),
array([0.0000000e+00, 0.0000000e+00, 0.0000000e+00, ..., 2.5944988e-08,
1.2123945e-07, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32),
array([-2.4280996e-06, -1.2362683e-05, -8.5034850e-07, ...,
-1.0113516e-11, 5.1403621e-12, 0.0000000e+00], dtype=float32),
array([9.6098862e-05, 1.6449913e-04, 1.1942573e-04, ..., 0.0000000e+00,
0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 1.3284328e-05, 7.4090644e-07, -7.2679302e-07, ...,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype=float32),
array([ 2.4700081e-05, 2.9454704e-05, 8.0751715e-06, ...,
1.2746801e-07, -1.6574201e-06, 0.0000000e+00], dtype=float32),
array([8.4619669e-06, 9.7476968e-06, 2.0182479e-05, ..., 2.1081217e-11,
4.0220186e-10, 0.0000000e+00], dtype=float32),
array([0., 0., 0., ..., 0., 0., 0.], dtype=float32))
y:
('08',
'08',
'06',
'05',
'05',
'04',
'06',
'07',
'01',
'04',
'03',
'07',
'03',
'01',
'03',
'03',
'02',
'02',
'02',
'02',
'05',
'06',
'04',
'08',
'07',
'06',
'04',
'05',
'07',
'02',
'08',
'01',
'08',
'03',
'08',
'02',
'03',
'06',
'04',
'07',
'04',
'07',
'05',
'06',
'08',
'08',
'04',
'05',
'05',
'04',
'06',
'07',
'05',
'07',
'01',
'06',
'02',
'02',
'03',
'03')
Code for the classifier plus the train/test split:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
from sklearn.tree import DecisionTreeClassifier
dtree = DecisionTreeClassifier()
dtree.fit(X_train, y_train)
Error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-70-b6417fbfb8de> in <module>()
1 from sklearn.tree import DecisionTreeClassifier
2 dtree = DecisionTreeClassifier()
----> 3 dtree.fit(X_train, y_train)
/usr/local/lib/python3.6/dist-packages/sklearn/tree/tree.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
788 sample_weight=sample_weight,
789 check_input=check_input,
--> 790 X_idx_sorted=X_idx_sorted)
791 return self
792
/usr/local/lib/python3.6/dist-packages/sklearn/tree/tree.py in fit(self, X, y, sample_weight, check_input, X_idx_sorted)
114 random_state = check_random_state(self.random_state)
115 if check_input:
--> 116 X = check_array(X, dtype=DTYPE, accept_sparse="csc")
117 y = check_array(y, ensure_2d=False, dtype=None)
118 if issparse(X):
/usr/local/lib/python3.6/dist-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
431 force_all_finite)
432 else:
--> 433 array = np.array(array, dtype=dtype, order=order, copy=copy)
434
435 if ensure_2d:
ValueError: setting an array element with a sequence.
EDIT1: I converted both X and y into NumPy arrays, but I still get the same error. Details below:
import numpy as np
X = np.asarray(X)
y = np.asarray(y)
X.shape, y.shape
Output:
((60,), (60,))
python machine-learning scikit-learn random-forest
edited Nov 17 '18 at 16:05
Marco G. de Pinto
asked Nov 17 '18 at 13:41
Marco G. de Pinto
1
check out this answer: stackoverflow.com/questions/36115472/…
– Tyson
Nov 17 '18 at 13:55
1
There is something wrong with your X or y. Try this first and report the result:
import numpy as np
X = np.array(X)
print(X.shape)
y = np.array(y)
print(y.shape)
– Luca Massaron
Nov 17 '18 at 16:01
I tried exactly that. After converting both X and y to NumPy arrays, the result is: X.shape, y.shape -> ((60,), (60,))
– Marco G. de Pinto
Nov 17 '18 at 16:03
1
The problem is the X. Now just try:
np.array(X).dtype
– Luca Massaron
Nov 17 '18 at 16:07
1
Your X is being read as a sequence of objects rather than as a numeric array; that's the problem. Check it carefully: either there is a string in it, or some of the arrays you put in it have a different length than the others. I will post an answer for you.
– Luca Massaron
Nov 17 '18 at 16:10
1 Answer
It appears that the problem is your X. Probably one of the arrays in it has a different length from the others. In that case, when scikit-learn converts your tuple into a NumPy array inside DecisionTreeClassifier.fit, the rows cannot be stacked into a 2-D numeric array. The best NumPy can do is a 1-D array of objects (each element is itself an array), which is not what the decision tree expects. This also matches the X.shape of (60,) that you reported.
Just check this code snippet:
import numpy as np
from numpy import array

X1 = (array([-8.1530527e-10, 8.9952795e-10, -9.1185753e-10,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype='float32'),
array([0., 0., 0., 0., 0., 0.], dtype='float32'),
array([0., 0., 0., 0., 0., 0.], dtype='float32'))
X2 = (array([-8.1530527e-10, 8.9952795e-10, -9.1185753e-10,
0.0000000e+00, 0.0000000e+00, 0.0000000e+00], dtype='float32'),
array([0., 0., 0., 0., 0., 0., 1.], dtype='float32'),
array([0., 0., 0., 0., 0., 0.], dtype='float32'))
# Older NumPy built an object array from ragged input implicitly;
# NumPy 1.24+ raises ValueError unless dtype=object is passed explicitly.
print("X1:", np.array(X1).dtype, "\nX2:", np.array(X2, dtype=object).dtype)
Adding just one extra number to the second element of X2 is enough: X2 can no longer be stacked into a 2-D float32 array, so it becomes a 1-D array of objects (dtype object) instead of a numeric matrix.
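To find out which rows are responsible, it can help to count the row lengths before handing X to scikit-learn. The snippet below is only a sketch under stated assumptions: `summarize_lengths` and `pad_to_matrix` are hypothetical helper names, `X_ragged` is a toy stand-in for the real X, and right-padding with zeros is just one possible repair (truncating to the shortest row, or fixing the feature extraction itself, may suit the data better).

```python
from collections import Counter

import numpy as np


def summarize_lengths(rows):
    """Return a Counter mapping row length -> number of rows with that length."""
    return Counter(len(r) for r in rows)


def pad_to_matrix(rows, fill_value=0.0):
    """Right-pad every 1-D row with fill_value so all rows share the longest length."""
    max_len = max(len(r) for r in rows)
    out = np.full((len(rows), max_len), fill_value, dtype=np.float32)
    for i, r in enumerate(rows):
        out[i, :len(r)] = r
    return out


# Toy stand-in for the real X: three rows, one of them longer than the others.
X_ragged = (
    np.zeros(6, dtype=np.float32),
    np.ones(7, dtype=np.float32),
    np.zeros(6, dtype=np.float32),
)

lengths = summarize_lengths(X_ragged)  # more than one key means the rows are ragged
X_fixed = pad_to_matrix(X_ragged)      # regular 2-D float32 matrix

print(lengths)
print(X_fixed.shape, X_fixed.dtype)
```

If `lengths` contains more than one key, the rows are ragged. Once X is a regular 2-D array such as `X_fixed`, it should be accepted by `train_test_split` and `dtree.fit` as in the question.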
1
Thank you Luca!
– Marco G. de Pinto
Nov 17 '18 at 16:21
answered Nov 17 '18 at 16:13
Luca Massaron