Error: single positional indexer is out of bounds

I have a DataFrame:

import pandas as pd

d = {'Id':[14038.0, 15053.0, 4765.0, 10783.0, 12915.0,5809.0, 11993.0, 5172.0, 10953.0, 11935.0,7917.0],
        'Square':[48.0, 65.7, 44.9, 39.6, 80.4,53.4, 80.3, 64.5, 53.8, 64.7, 212.9],
        'LifeSquare':[29.4, 40.0, 29.2, 23.8, 46.7,52.7,0 ,0 , 52.4, 0, 211.2]}
df = pd.DataFrame(d)

The task: correct the LifeSquare parameter before training a model.
I wrote a function that picks the closest similar values:

def square_correction(data):
    item = 'LifeSquare'
    valid = data.loc[~((data[item] > data['Square'] * 0.8) |
                       (data[item] < data['Square'] * 0.3)|
                       (data[item]).isna())]
    invalid = data.loc[(data[item] > data['Square'] * 0.8) |
                       (data[item] < data['Square'] * 0.3)|
                       (data[item]).isna()]

    best_feature, item_by_best_feature = best_params(valid, item)

    for i in range(0, len(invalid[item])):
        flat_id = invalid[item].index[i]
        best_feature_meaning = invalid[best_feature][flat_id]


        bigger = valid.loc[(valid[best_feature] >= best_feature_meaning)].reset_index().iloc[0]
        smoller = valid.loc[(valid[best_feature] <= best_feature_meaning)].reset_index().iloc[-1]

        difference_up  = (bigger[best_feature] - data[best_feature][flat_id])
        difference_down = (data[best_feature][flat_id] - smoller[best_feature])

        text = f'flat id:{flat_id}. {item} was changed. {i+1} of {len(invalid[item])} done.'
        if  difference_up ==  difference_down:
            print(text)
            data[item][flat_id] = item_by_best_feature[best_feature_meaning]
        elif not difference_up >=  difference_down:
            print(text)
            data[item][flat_id] = bigger[item]
        else:
            print(text)
            data[item][flat_id] = smoller[item]    


    print(f'best feature: {best_feature}. {len(invalid)} rows was changed.')
    return data

Run the function:

df = square_correction(df)

Everything goes fine until the last row, where Jupyter Notebook raises an error:

IndexError: single positional indexer is out-of-bounds

Why does it dislike just this one observation out of all of them?

P.S. On the training DataFrame (10,000 observations) it raises the same error:

IndexError                                Traceback (most recent call last)
<ipython-input-17-d4ceb1216100> in <module>
----> 1 data = square_correction(data)

<ipython-input-16-c8f2bf3d18d3> in square_correction(data)
     20 
     21 
---> 22         bigger = valid.loc[(valid[best_feature] >= best_feature_meaning)].reset_index().iloc[0]
     23         smoller = valid.loc[(valid[best_feature] <= best_feature_meaning)].reset_index().iloc[-1]
     24 

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in __getitem__(self, key)
   1498 
   1499             maybe_callable = com.apply_if_callable(key, self.obj)
-> 1500             return self._getitem_axis(maybe_callable, axis=axis)
   1501 
   1502     def _is_scalar_access(self, key):

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _getitem_axis(self, key, axis)
   2228 
   2229             # validate the location
-> 2230             self._validate_integer(key, axis)
   2231 
   2232             return self._get_loc(key, axis=axis)

~/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py in _validate_integer(self, key, axis)
   2137         len_axis = len(self.obj._get_axis(axis))
   2138         if key >= len_axis or key < -len_axis:
-> 2139             raise IndexError("single positional indexer is out-of-bounds")
   2140 
   2141     def _getitem_tuple(self, tup):

IndexError: single positional indexer is out-of-bounds
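One likely cause, assuming that best_params (not shown in the question) returns 'Square' as best_feature: for the flat with the largest Square, no valid row has a greater or equal value, so the filter returns an empty DataFrame and .iloc[0] has nothing to select. The sketch below reproduces that on a small stand-in for valid and guards against it. The helper name first_row_or_none is hypothetical, and sorting before taking the nearest smaller row is an added step, not part of the original function.

```python
import pandas as pd

# Stand-in for the `valid` frame from the question (rows taken from the sample data).
valid = pd.DataFrame({'Square': [48.0, 65.7, 44.9, 39.6, 80.4],
                      'LifeSquare': [29.4, 40.0, 29.2, 23.8, 46.7]})

# Assumption: best_feature is 'Square'; 212.9 is the Square of the last sample row.
best_feature = 'Square'
best_feature_meaning = 212.9

# No valid flat is at least this large, so the selection is empty.
candidates = valid.loc[valid[best_feature] >= best_feature_meaning]
print(len(candidates))


def first_row_or_none(frame):
    """Hypothetical helper: return the first row, or None if the frame is empty."""
    return frame.iloc[0] if not frame.empty else None


bigger = first_row_or_none(candidates)

# Sort descending so that the first row is the nearest smaller-or-equal value.
smaller = first_row_or_none(
    valid.loc[valid[best_feature] <= best_feature_meaning]
         .sort_values(best_feature, ascending=False)
)
```

With a guard like this, the loop can fall back to whichever neighbour exists when bigger or smaller is None instead of failing on the empty selection.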

Indexing is an essential tool for storing and handling large and complex datasets with rows and columns. In Python, we use index values within square brackets to perform the indexing. If we try to access an index beyond the dimensions of the dataset, we will raise the error: IndexError: single positional indexer is out-of-bounds.

This tutorial will go through the error in detail, and we will go through an example scenario to learn how to solve the error.


Table of contents

  • IndexError: single positional indexer is out-of-bounds
    • What is an IndexError?
    • What is a DataFrame?
    • What is iloc()?
  • Example : Accessing a Column That Does Not Exist
    • Solution
  • Summary

IndexError: single positional indexer is out-of-bounds

What is an IndexError?

Python’s IndexError occurs when the specified index does not lie within the range of valid indices of a sequence. In Python, index numbers start from 0. Let’s look at an example of a typical Python list:

animals = ["lion", "sheep", "whale"]

This list contains three values, and the first element, lion, has an index value of 0. The second element, sheep, has an index value of 1. The third element, whale, has an index value of 2.

If we try to access an item at index position 3, we will raise an IndexError.

print(animals[3])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
1 print(animals[3])

IndexError: list index out of range

What is a DataFrame?

A DataFrame is a data structure that organizes data into a 2-dimensional table of rows and columns. The Python module Pandas works with DataFrames.

What is iloc()?

Pandas offers the iloc indexer, which enables us to select particular rows, columns, or individual cells of a dataset. iloc performs integer-based indexing for selection by position, and it is used with square brackets rather than called like a function. iloc will raise “IndexError: single positional indexer is out-of-bounds” if a requested index is out-of-bounds. However, this error will not occur if you use a slice index, for example,

array[:slice_index]

Slice indexing allows for out-of-bounds indexing, which conforms with Python/numpy slice semantics. Let’s look at an example of the IndexError.

Example : Accessing a Column That Does Not Exist

Let’s create a DataFrame and attempt to access a particular column in the DataFrame. The dataset will contain a list of five car owners and will store each car owner’s city of residence and the brand of car they own. First, we must import Pandas and then define the columns that comprise our DataFrame. One column will store names, one will store cities, and one will store cars.

import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Lisa', 'Paul', 'Carol', 'Biff'],
                   'City': ['Lisbon', 'Palermo', 'Sofia', 'Munich', 'Bangkok'],
                   'Car': ['Mercedes', 'Bentley', 'Ferrari', 'Rolls Royce', 'Aston Martin']})


If we print the DataFrame to the console, we will get the following arrangement of data in five rows and three columns.

print(df)
    Name     City           Car
0    Jim   Lisbon      Mercedes
1   Lisa  Palermo       Bentley
2   Paul    Sofia       Ferrari
3  Carol   Munich   Rolls Royce
4   Biff  Bangkok  Aston Martin

Let’s try to access the column at index 5 (which would be the sixth column) using iloc. In this example, it looks like:

print(df.iloc[:,5])
IndexError: single positional indexer is out-of-bounds

We raise the IndexError because we tried to access the column at index 5, and this particular dataset only has columns at indices 0, 1, and 2.

Solution

To solve this error, we can start by getting the shape of the dataset:

print(df.shape)
(5, 3)

This result tells us that the dataset has five rows and three columns, which means we can only use column index up to 2. Let’s try to take the car column with index 2.

print(df.iloc[:,2])
0        Mercedes
1         Bentley
2         Ferrari
3     Rolls Royce
4    Aston Martin
Name: Car, dtype: object

The code runs, and we can extract the car column from the dataset and print it to the console.

We can also access one particular value in the dataset by passing both the row position and the column position to iloc, separated by a comma. Let’s try to get the car that Jim from Lisbon owns:

# Get particular value by row and column position

jim_car = df.iloc[0, 2]

print(jim_car)
Mercedes

The code runs and prints the value specific to row 0 column 2.

We can take a dataset slice using a colon followed by a comma then the slice. Let’s look at an example of slicing the first two columns of the car dataset:

print(df.iloc[:, 0:2])
    Name     City
0    Jim   Lisbon
1   Lisa  Palermo
2   Paul    Sofia
3  Carol   Munich
4   Biff  Bangkok

We can also use slice indices beyond the bounds of the dataset. Let’s use a slice that asks for five columns:

print(df.iloc[:, 0:5])
    Name     City           Car
0    Jim   Lisbon      Mercedes
1   Lisa  Palermo       Bentley
2   Paul    Sofia       Ferrari
3  Carol   Munich   Rolls Royce
4   Biff  Bangkok  Aston Martin

Although the dataset only has three columns, we can use a slice that spans five because slice indexers allow out-of-bounds indexing. Therefore we will not raise the IndexError: single positional indexer is out-of-bounds.

Summary

Congratulations on reading to the end of this tutorial! The error “IndexError: single positional indexer is out-of-bounds” occurs when you try to access a row or column with an index value outside the bounds of the pandas DataFrame. To solve this error, you must use index values within the dimensions of the dataset. You can get the dimensions of a dataset using shape. Once you know the valid index values, you can get specific values using iloc, which does integer-location based indexing.

It is important to note that using a slice with integers in iloc will not raise the IndexError because slice indexers allow out-of-bounds indexing.
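As a compact comparison of the cases summarized above, the sketch below checks which kinds of positional selection raise IndexError on a three-row, two-column frame. The helper raises_index_error is a hypothetical name used only for this illustration. The list case is an addition to the tutorial's examples: pandas also rejects an out-of-bounds position inside a list, although with a slightly different message.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Lisa', 'Paul'],
                   'City': ['Lisbon', 'Palermo', 'Sofia']})


def raises_index_error(selector):
    """Hypothetical helper: return True if calling `selector` raises IndexError."""
    try:
        selector()
    except IndexError:
        return True
    return False


scalar_fails = raises_index_error(lambda: df.iloc[:, 5])      # single position
list_fails = raises_index_error(lambda: df.iloc[:, [0, 5]])   # list of positions
slice_fails = raises_index_error(lambda: df.iloc[:, 0:5])     # slice

print(scalar_fails, list_fails, slice_fails)
```

Only the slice is silently truncated to the columns that exist; the single position and the list both fail.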


Indexing plays a critical role in storing and handling large and complex data sets. When we deal with compound data types like lists and tuples, or with data sets that have rows and columns, we frequently use index values within square brackets to access their elements. In this article, we will talk about the index-based error: single positional indexer is out-of-bounds.

What is this “IndexError: single positional indexer is out-of-bounds” error?

This is an index-based error that appears when a program tries to access a position beyond the valid range of the index. Suppose you have a list with five elements. Its index starts at 0 and runs up to 4. If you try to access, display, or change the value at index 7, it will not work, because the valid index range is 0 to 4. This range is what we call the bounds, and accessing an element outside it is what the Python interpreter calls an out-of-bounds situation.

IndexError when accessing a dataset:

Suppose you select a column from a dataset with Y = Dataset.iloc[:,18].values

In this case, if you see the out-of-bounds error, it is most probably because your dataset has fewer than 19 columns. Column positions start at 0, so position 18 is the 19th column, and you are trying to access something that does not exist.
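A minimal sketch of checking the column count before selecting by position. The three-column dataset here is a hypothetical stand-in, not the dataset from the example above.

```python
import pandas as pd

# Hypothetical dataset with only three columns.
dataset = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

col_pos = 18
n_cols = dataset.shape[1]  # shape is (rows, columns)

if col_pos < n_cols:
    y = dataset.iloc[:, col_pos].values
else:
    y = None
    print(f"Column position {col_pos} is out of bounds; "
          f"valid positions are 0 to {n_cols - 1}.")
```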

IndexError when the DataFrame size is unknown:

Such an error also occurs when you index a row or a column with a number greater than the dimensions of your DataFrame allow. For example, it occurs if you try to fetch the column at position 8 when only three columns are defined, like this.

Error Code:

import pandas as pd

df = pd.DataFrame({'Name': ['Karl', 'Ray', 'Gaurav', 'Dee', 'Sue'],
                   'City': ['London', 'Montreal', 'Delhi', 'New York', 'Glasgow'],
                   'Car': ['Maruti', 'Audi', 'Ferrari', 'Rolls Royce', 'Tesla']})

print(df)

# Column position 8 does not exist: valid column positions are 0 to 2
x = df.iloc[0, 8]

print(x)

Output:

raise IndexError("single positional indexer is out-of-bounds")

IndexError: single positional indexer is out-of-bounds

This program raises an error because the column position we want to fetch (8) does not exist.

This also happens if the programmer misunderstands iloc. The iloc indexer selects a particular cell, row, or column of a DataFrame by integer position.

With iloc, the value before the comma (,) is the row position and the value after the comma is the column position. If either position lies outside the valid range, iloc cannot fetch any data and raises this error.
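To make the row-and-column convention concrete, here is a short sketch of the common forms of positional selection. The small frame reuses names from the example but is otherwise only an illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Karl', 'Ray', 'Gaurav'],
                   'City': ['London', 'Montreal', 'Delhi']})

cell = df.iloc[1, 0]      # row position 1, column position 0: a single value
row = df.iloc[1]          # whole row at position 1, returned as a Series
column = df.iloc[:, 1]    # whole column at position 1, returned as a Series
last_row = df.iloc[-1]    # negative positions count from the end

print(cell)
print(row['City'])
print(list(column))
print(last_row['Name'])
```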

Correct code:

import pandas as pd

df = pd.DataFrame({'Name': ['Karl', 'Ray', 'Gaurav', 'Dee', 'Sue'],
                   'City': ['London', 'Montreal', 'Delhi', 'New York', 'Glasgow'],
                   'Car': ['Maruti', 'Audi', 'Ferrari', 'Rolls Royce', 'Tesla']})

print(df)

# Row position 3 and column position 0 both exist
x = df.iloc[3, 0]

print("\nFetched value using the iloc() is: ", x)

Output:

     Name      City          Car
0    Karl    London       Maruti
1     Ray  Montreal         Audi
2  Gaurav     Delhi      Ferrari
3     Dee  New York  Rolls Royce
4     Sue   Glasgow        Tesla

Fetched value using the iloc() is:  Dee

Explanation:

First we create the DataFrame (a 2-D dataset) with three columns and five rows and print it. Here we pass a row position and a column position that both exist, so we do not receive any error. Therefore, to resolve the “IndexError: single positional indexer is out-of-bounds” error, we first have to check the bounds of the rows and columns in our dataset.

Conclusion:

To avoid encountering this error repeatedly, programmers need to know how many rows and columns the dataset has and check that index values fall within the valid range. iloc is an easy and comfortable way to retrieve any value, but the programmer needs to make sure to refer to valid index values; otherwise the “IndexError: single positional indexer is out-of-bounds” error will pop up.

In Python, an IndexError occurs when you try to access an index that is outside the valid index range of a data structure like a list, tuple, or dataframe. This error can be frustrating, especially when you are working with large datasets. In this tutorial, we will discuss how to fix the “IndexError: single positional indexer is out-of-bounds” error that occurs when you try to access an index outside the valid index range in a dataframe in Python.

Understanding the Error

Before we dive into the solution, let’s first understand the error message. The “IndexError: single positional indexer is out-of-bounds” error occurs when you try to access an index that is outside the valid index range of a dataframe. For example, if you have a dataframe with 5 rows and you try to access the 6th row, you will get this error.

Let’s reproduce this error in an example.

import pandas as pd

# create a pandas dataframe
df = pd.DataFrame({
    'Name': ['Jim', 'Dwight', 'Oscar', 'Tobi', 'Angela'],
    'Age': [26, 30, 28, 38, 31],
    'Department': ['Sales', 'Sales', 'Accounting', 'HR', 'Accounting']
})

# try to access the 6th row, row at index 5
print(df.iloc[5])

Output:

---------------------------------------------------------------------------

IndexError                                Traceback (most recent call last)

Cell In[9], line 11
      4 df = pd.DataFrame({
      5     'Name': ['Jim', 'Dwight', 'Oscar', 'Tobi', 'Angela'],
      6     'Age': [26, 30, 28, 38, 31],
      7     'Department': ['Sales', 'Sales', 'Accounting', 'HR', 'Accounting']
      8 })
     10 # try to access the 6th row, row at index 5
---> 11 print(df.iloc[5])

File ~/miniforge3/envs/dsp/lib/python3.8/site-packages/pandas/core/indexing.py:931, in _LocationIndexer.__getitem__(self, key)
    928 axis = self.axis or 0
    930 maybe_callable = com.apply_if_callable(key, self.obj)
--> 931 return self._getitem_axis(maybe_callable, axis=axis)

File ~/miniforge3/envs/dsp/lib/python3.8/site-packages/pandas/core/indexing.py:1566, in _iLocIndexer._getitem_axis(self, key, axis)
   1563     raise TypeError("Cannot index by location index with a non-integer key")
   1565 # validate the location
-> 1566 self._validate_integer(key, axis)
   1568 return self.obj._ixs(key, axis=axis)

File ~/miniforge3/envs/dsp/lib/python3.8/site-packages/pandas/core/indexing.py:1500, in _iLocIndexer._validate_integer(self, key, axis)
   1498 len_axis = len(self.obj._get_axis(axis))
   1499 if key >= len_axis or key < -len_axis:
-> 1500     raise IndexError("single positional indexer is out-of-bounds")

IndexError: single positional indexer is out-of-bounds

We get the IndexError: single positional indexer is out-of-bounds error.

Fixing the error

To fix the “IndexError: single positional indexer is out-of-bounds” error, you need to make sure that you are accessing a valid index in the dataframe. Here are some ways to do that:

1) Use an index within the index range

If the index that you’re trying to use lies within the index range (that is, it’s a valid index in the dataframe), you’ll not get this error. For example, in the above dataframe, if we use the index 4, representing the row 5, we’ll not get an error.

# try to access the 5th row, row at index 4
print(df.iloc[4])

Output:

Name              Angela
Age                   31
Department    Accounting
Name: 4, dtype: object

But we cannot always know beforehand whether an index is a valid index or not.

2) Check if the index is within the valid range using If statement

One way to avoid this error is to use conditional statements to check if the index is within the valid range before accessing it. Here’s an example:

# try to access the 6th row, row at index 5
index = 5
if index < len(df):
    print(df.iloc[index])
else:
    print("Index out of range")

Output:

Index out of range

In the above example, we first check if the row index we’re trying to access is less than the dataframe’s length. If it is, we access the row at the given index using iloc. If it’s not, we print a message saying that the index is out of range.
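The check above only covers non-negative positions. iloc also accepts negative positions that count from the end, and the pandas validation visible in the traceback rejects a key when key >= len_axis or key < -len_axis. A sketch of a check that covers both directions follows; the function name is_valid_position is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jim', 'Dwight', 'Oscar', 'Tobi', 'Angela'],
                   'Age': [26, 30, 28, 38, 31]})


def is_valid_position(frame, index):
    """Hypothetical helper: True if `index` is a valid row position for frame.iloc."""
    return -len(frame) <= index < len(frame)


# Positions 4 and -5 exist in a five-row frame; 5 and -6 do not.
results = {i: is_valid_position(df, i) for i in (4, 5, -5, -6)}
print(results)
```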

3) Using try-except

Alternatively, you can also use exception handling to handle this error.

# try to access the 6th row, row at index 5
try:
    index = 5
    print(df.iloc[index])
except IndexError:
    print("Index out of range")

Output:

Index out of range

Conclusion

The “IndexError: single positional indexer is out-of-bounds” error occurs when you try to access an index outside the valid index range in a dataframe in Python. To fix the error, you need to make sure that you are accessing a valid index in the dataframe. You can check the index range, use conditional statements, or error handling to avoid this error.

Piyush Raj

Piyush is a data professional passionate about using data to understand things better and make informed decisions. He has experience working as a Data Scientist in the consulting domain and holds an engineering degree from IIT Roorkee. His hobbies include watching cricket, reading, and working on side projects.

You will often see the error IndexError: single positional indexer is out-of-bounds when your code references a row that does not exist at the given index value.

When you want to look at a particular row in pandas, you can reference the row by its position and then the values within it.

Let’s break it down further to understand how the error occurs, why it occurs, and how to fix it.

How does the error occur?

The code below throws the error we are trying to fix.

Digging deeper, let’s look at the file we are importing and the values it contains.

The CSV file contains two columns, Name and Age, and five rows of values. Laid out as a matrix of index values, the row index values range from zero to four, and the column index values are zero and one.

In the code below, though, we are trying to reference a row index value of five, which does not exist, hence the error.

Note that “iloc” allows you to return all the values of a row or a particular row and column value. We will demonstrate that in the next section.

import pandas as pd
dataset = pd.read_csv('import_file.csv', sep=',')
df = pd.DataFrame(dataset, columns=['Name', 'Age'])
a = df.iloc[5]  # row position 5 does not exist: valid row positions are 0 to 4
print(df)

Error:

IndexError: single positional indexer is out-of-bounds

How to fix this error?

First, let’s return the whole row at index value two.

This should return Jim and 23 in the output.

import pandas as pd
dataset = pd.read_csv('import_file.csv', sep=',')
df = pd.DataFrame(dataset, columns=['Name', 'Age'])
a = df.iloc[2]  # select the whole row at position 2
print(df)
print(a)

Output:

       Name  Age
0       Joe   21
1      John   22
2       Jim   23
3      Jane   24
4  Jennifer   25
Name    Jim
Age      23
Name: 2, dtype: object

We could also return just a name or an age value, as long as the positions are within the valid range.

Let’s return just Jennifer’s age of 25 as follows:

import pandas as pd
dataset = pd.read_csv('import_file.csv', sep=',')
df = pd.DataFrame(dataset, columns=['Name', 'Age'])
a = df.iloc[4, 1]  # select the value at row position 4, column position 1
print(df)
print(a)

Output:

       Name  Age
0       Joe   21
1      John   22
2       Jim   23
3      Jane   24
4  Jennifer   25
25

So in summary:

(A) When you are looking to retrieve particular values in a row, you need to make sure you have a valid range of index values.

(B) Using “iloc” is a handy way to retrieve any value you want, but make sure you reference the correct index values.
