Too many indices error, in my already micro dataset

This is my small test CSV file, because I'm trying to isolate the error.

Destination Port    Flow Duration    Total Fwd Packets    Labels
22                  20               100                  BENIGN
21                  30               200                  BENIGN
43                  30               100                  Bot
15                  11               203                  Bot

So when I run the following code, to preprocess the data,

import csv
# import itertools
import os
from os.path import join as os_join

import numpy as np


dataroot = r'D:DDOStesting_too_many_indices'

attacks = {'BENIGN': 0, 'Bot': 1}
# and many others

features = ['Destination Port', 'Flow Duration', 'Total Fwd Packets']

np.random.seed(1202)


def get_filenames(a_dir):
    # Names of the regular files (not sub-directories) in a_dir.
    return [name for name in os.listdir(a_dir)
            if os.path.isfile(os.path.join(a_dir, name))]


def read_csv(filename, sampling_rate=100):
    # Read every row, then draw a random sample (with replacement)
    # of sampling_rate percent of the rows.
    with open(filename) as csv_file:
        print('Reading {} ...'.format(filename))
        reader = csv.reader(csv_file)
        # header = next(reader)
        data = [row for row in reader]
        print('#{} rows read'.format(len(data)))
        N = len(data)
        sample_size = N*sampling_rate//100
        indices = np.random.randint(0, N, sample_size)
        sampled_data = [data[i] for i in indices]
    return sampled_data


def read_data(dataroot, sampling_rate=10, seed=0):
    # Concatenate the sampled rows of every file under dataroot.
    np.random.seed(seed)
    filenames = get_filenames(dataroot)
    data = []
    for filename in filenames:
        data_part = read_csv(os_join(dataroot, filename), sampling_rate)
        data += data_part
    return data


data = read_data(dataroot, sampling_rate=10)

arr = np.array(data)
# X contains feature values and Y the output values.
X = arr[:, :-1].astype(np.float32)  # ERROR HERE.
Y_str = arr[:, -1]

I get the following error:

File "/some_path/temp.py", line 63, in <module>
    X = arr[:, :-1].astype(np.float32)
IndexError: too many indices for array

What should I do? I've racked my brains over other possibly duplicate questions, such as this one and this one. I also wrote the sample snippet above to test in isolation and read about 2D slicing, yet I am still clueless. Can you help me?

asked Nov 11 at 10:32

venom8914

178219

Your arr has shape (3,) (check arr.shape), so it is not two-dimensional. On a 1-D array you can only take arr[:-1].
– Rudolf Morkovskyi
Nov 11 at 12:08
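
To make the comment above concrete, here is a minimal sketch of how the same slice behaves on a 1-D versus a 2-D array. The arrays and the helper name try_2d_slice are made up for illustration; they are not taken from the question's data pipeline.

```python
import numpy as np

# Hypothetical arrays for illustration only.
one_d = np.array(["22", "20", "100"])             # shape (3,)
two_d = np.array([["22", "20", "100", "BENIGN"],
                  ["43", "30", "100", "Bot"]])    # shape (2, 4)


def try_2d_slice(arr):
    """Return all columns but the last, or None if arr is not 2-D."""
    try:
        return arr[:, :-1]
    except IndexError:
        # Raised when two indices are applied to a 1-D array.
        return None


print(one_d.shape, try_2d_slice(one_d))
print(two_d.shape, try_2d_slice(two_d).shape)
print(one_d[:-1])  # a single index works on a 1-D array
```

The two-index form only works once arr really has two dimensions, which is what the question's code takes for granted.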

You are assuming arr is a 2-D array. Don't assume. Verify. Check its shape and dtype. If it is a 1-D object-dtype array, check the elements.
– hpaulj
Nov 11 at 13:12
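
A rough sketch of that check, together with one plausible reason arr ends up 1-D here. It assumes the test file yields five rows (the header plus four data rows, since the header skip is commented out). The helpers describe and sample_size are hypothetical names; sample_size only repeats the arithmetic used in read_csv.

```python
import numpy as np


def describe(arr):
    """Return (ndim, shape, dtype) so assumptions about arr can be verified."""
    return arr.ndim, arr.shape, arr.dtype


def sample_size(n_rows, sampling_rate):
    """Same integer arithmetic as in the question's read_csv."""
    return n_rows * sampling_rate // 100


n_rows = 5                      # assumption: header + 4 data rows
size = sample_size(n_rows, 10)  # read_data is called with sampling_rate=10
sampled = []                    # what read_csv returns when size is 0
arr = np.array(sampled)

print(size)           # number of rows that would be sampled
print(describe(arr))  # an empty list becomes a 1-D array of shape (0,)
```

Under that assumption the sample size is zero for any file with fewer than 10 rows, so np.array(data) is an empty 1-D array and arr[:, :-1] fails. Printing arr.shape right before the failing line should confirm or rule this out.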

python numpy slice data-processing
