"Too many indices" error in my already micro dataset












This is my small test CSV file, because I'm trying to isolate the error.



Destination Port    Flow Duration   Total Fwd Packets   Labels
22                  20              100                 BENIGN
21                  30              200                 BENIGN
43                  30              100                 Bot
15                  11              203                 Bot


When I run the following code to preprocess the data:



import csv
# import itertools
import os
from os.path import join as os_join
import numpy as np


dataroot = r'D:DDOStesting_too_many_indices'

attacks = {'BENIGN': 0, 'Bot': 1}
# and many others

features = ['Destination Port', 'Flow Duration', 'Total Fwd Packets']

np.random.seed(1202)


def get_filenames(a_dir):
    return [name for name in os.listdir(a_dir)
            if os.path.isfile(os.path.join(a_dir, name))]


def read_csv(filename, sampling_rate=100):
    with open(filename) as csv_file:
        print('Reading {} ...'.format(filename))
        reader = csv.reader(csv_file)
        # header = next(reader)
        data = [row for row in reader]
        print('#{} rows read'.format(len(data)))
        N = len(data)
        sample_size = N*sampling_rate//100
        indices = np.random.randint(0, N, sample_size)
        sampled_data = [data[i] for i in indices]
    return sampled_data


def read_data(dataroot, sampling_rate=10, seed=0):
    np.random.seed(seed)
    filenames = get_filenames(dataroot)
    data = []
    for filename in filenames:
        data_part = read_csv(os_join(dataroot, filename), sampling_rate)
        data += data_part
    return data


data = read_data(dataroot, sampling_rate=10)

arr = np.array(data)
# X contains feature values and Y the output values.
X = arr[:, :-1].astype(np.float32)  # ERROR HERE.
Y_str = arr[:, -1]


I get the following error:



File "/some_path/temp.py", line 63, in <module>
X = arr[:, :-1].astype(np.float32)

IndexError: too many indices for array


What should I do? I've racked my brain over other, possibly duplicate, questions, built the sample snippet above to test in isolation, and read about 2D slicing, yet I am still clueless. Can you help me?
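
One way to narrow this down is to replay the sampling arithmetic from read_csv on its own. The sketch below is only an illustration under stated assumptions: it assumes the five lines shown above (header included, because next(reader) is commented out) are the only rows read, and it uses a placeholder row instead of the real file. Under those assumptions the sample comes out empty, and an empty list becomes a 1-D array that cannot take two indices.

```python
import numpy as np

# Assumption: only the five lines shown above are read, header included.
n_rows = 5
sampling_rate = 10  # the value passed to read_data(...)

# Same arithmetic as in read_csv: integer division rounds down.
sample_size = n_rows * sampling_rate // 100

np.random.seed(0)
indices = np.random.randint(0, n_rows, sample_size)  # sample_size may be 0
data = [["placeholder row"] for _ in indices]        # one entry per sampled index

arr = np.array(data)
print("sample_size:", sample_size)
print("shape:", arr.shape, "ndim:", arr.ndim)

try:
    arr[:, :-1]  # two indices on an array with fewer than two dimensions
except IndexError as exc:
    print("IndexError:", exc)
```

If the printed sample size is zero, a higher sampling_rate or a larger test file would be needed before the 2-D slice can work.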










  • Your arr is one-dimensional, with a shape such as (3,) (check arr.shape), so arr[:, :-1] cannot work. On a 1-D array you can only take arr[:-1].
    – Rudolf Morkovskyi
    Nov 11 at 12:08










  • You are assuming arr is a 2-D array. Don't assume; verify. Check its shape and dtype. If it is a 1-D object-dtype array, check the elements.
    – hpaulj
    Nov 11 at 13:12
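
The check suggested in the comments can be sketched as a small guard placed in front of the slice. This is a minimal illustration, not the asker's code: split_features_labels is a hypothetical helper name, and the two sample rows are copied from the test file above on the assumption that each row has already been split into four string fields.

```python
import numpy as np


def split_features_labels(rows):
    """Hypothetical helper: split rows into float features and string labels.

    Reports the actual shape and dtype instead of letting the 2-D slice
    fail with a bare IndexError.
    """
    arr = np.array(rows)
    if arr.ndim != 2:
        raise ValueError(
            "expected a 2-D array, got shape {} and dtype {}".format(
                arr.shape, arr.dtype))
    X = arr[:, :-1].astype(np.float32)  # every column except the last
    y_str = arr[:, -1]                  # last column holds the label
    return X, y_str


# Assumed input: rows already split into four string fields each.
rows = [["22", "20", "100", "BENIGN"],
        ["43", "30", "100", "Bot"]]
X, y_str = split_features_labels(rows)
print(X.shape, y_str.tolist())

try:
    split_features_labels([])  # an empty sample gives a 1-D array
except ValueError as exc:
    print("ValueError:", exc)
```

Printing arr.shape and arr.dtype right after np.array(data) gives the same information without a helper; the guard only makes the failure message say what was actually built.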
















python numpy slice data-processing






asked Nov 11 at 10:32









venom8914
