Questions tagged [pandas]

Pandas is a Python library for data manipulation and analysis, e.g. dataframes, multidimensional time series and cross-sectional datasets commonly found in statistics, experimental science results, econometrics, or finance. Pandas is one of the main data science libraries in Python.

4119 votes
34 answers
7.5m views

How can I iterate over rows in a Pandas DataFrame?

I have a pandas dataframe, df:

       c1   c2
    0  10  100
    1  11  110
    2  12  120

How do I iterate over the rows of this dataframe? For every row, I want to access its elements (values in cells) by the name ...
Roman's user avatar
  • 129k
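
A minimal sketch of one possible approach, assuming the three-row frame shown in the question. `itertuples` yields one namedtuple per row, so cells can be read by column name; the `pairs` list is only there to make the result visible.

```python
import pandas as pd

# The frame from the question.
df = pd.DataFrame({"c1": [10, 11, 12], "c2": [100, 110, 120]})

pairs = []
for row in df.itertuples(index=False):
    # Each row is a namedtuple whose fields are named after the columns.
    pairs.append((int(row.c1), int(row.c2)))

print(pairs)
```

For column-wise arithmetic, a vectorized expression such as `df["c1"] + df["c2"]` is usually faster than an explicit loop.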
3541 votes
18 answers
6.5m views

How do I select rows from a DataFrame based on column values?

How can I select rows from a DataFrame based on values in some column in Pandas? In SQL, I would use: SELECT * FROM table WHERE column_name = some_value
szli's user avatar
  • 38.4k
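
A rough pandas counterpart of the SQL above uses a boolean mask. This is a sketch only; the frame contents and the `other` column are made up for illustration.

```python
import pandas as pd

# Illustrative data; only the name "column_name" comes from the question.
df = pd.DataFrame({"column_name": ["x", "y", "x"], "other": [1, 2, 3]})
some_value = "x"

# The comparison builds a boolean Series; .loc keeps the rows where it is True.
selected = df.loc[df["column_name"] == some_value]
print(selected)
```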
2993 votes
33 answers
6.6m views

Renaming column names in Pandas

I want to change the column labels of a Pandas DataFrame from ['$a', '$b', '$c', '$d', '$e'] to ['a', 'b', 'c', 'd', 'e']
user1504276's user avatar
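
One way this could be done, sketched with a single made-up row of data: pass a function to `DataFrame.rename` so every label has its leading `$` stripped.

```python
import pandas as pd

# Column labels from the question; the row values are placeholders.
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=["$a", "$b", "$c", "$d", "$e"])

# rename returns a new frame; the callable is applied to each column label.
renamed = df.rename(columns=lambda label: label.lstrip("$"))
print(list(renamed.columns))
```

Assigning a full list to `df.columns` also works when every new label is known up front.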
2264 votes
22 answers
4.3m views

Delete a column from a Pandas DataFrame

To delete a column in a DataFrame, I can successfully use: del df['column_name'] But why can't I use the following? del df.column_name Since it is possible to access the Series via df.column_name, I ...
John's user avatar
  • 42.6k
1948 votes
19 answers
4.4m views

How do I get the row count of a Pandas DataFrame?

How do I get the number of rows of a pandas dataframe df?
yemu's user avatar
  • 27.7k
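
A short sketch of two common ways to get the row count, using an illustrative frame that is not part of the question:

```python
import pandas as pd

# Illustrative frame with three rows.
df = pd.DataFrame({"c1": [10, 11, 12], "c2": [100, 110, 120]})

n_rows = len(df)                 # length of the index
n_rows_from_shape = df.shape[0]  # shape is (rows, columns)
print(n_rows, n_rows_from_shape)
```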
1761 votes
22 answers
4.1m views

Selecting multiple columns in a Pandas dataframe

How do I select columns a and b from df, and save them into a new dataframe df1?

    index  a  b  c
    1      2  3  4
    2      3  4  5

Unsuccessful attempt:

    df1 = df['a':'b']
    df1 = df.ix[:, 'a':'b']
user1234440's user avatar
  • 23.3k
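
A possible answer sketch, assuming the two-row frame from the question. A list of labels inside the brackets selects columns, and `.loc` accepts a label slice; `.ix` was removed in pandas 1.0, so it is not used here.

```python
import pandas as pd

# The frame from the question, with 1 and 2 as the index labels.
df = pd.DataFrame({"a": [2, 3], "b": [3, 4], "c": [4, 5]}, index=[1, 2])

# A list of column labels selects those columns; .copy() makes df1 independent.
df1 = df[["a", "b"]].copy()

# Label slices with .loc include both endpoints.
df1_by_slice = df.loc[:, "a":"b"]
print(df1)
```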
1647 votes
43 answers
2.5m views

How to change the order of DataFrame columns?

I have the following DataFrame (df): import numpy as np import pandas as pd df = pd.DataFrame(np.random.rand(10, 5)) I add more column(s) by assignment: df['mean'] = df.mean(1) How can I move the ...
Timmie's user avatar
  • 16.7k
1540 votes
17 answers
4.0m views

Change column type in pandas

I created a DataFrame from a list of lists: table = [ ['a', '1.2', '4.2' ], ['b', '70', '0.03'], ['x', '5', '0' ], ] df = pd.DataFrame(table) How do I convert the columns to ...
1452 votes
17 answers
2.2m views

How to drop rows of Pandas DataFrame whose value in a certain column is NaN

I have this DataFrame and want only the records whose EPS column is not NaN: STK_ID EPS cash STK_ID RPT_Date 601166 20111231 601166 NaN NaN 600036 20111231 ...
bigbug's user avatar
  • 58.3k
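
A sketch of two equivalent filters. The column names come from the question, but the values below are invented because the excerpt is truncated.

```python
import numpy as np
import pandas as pd

# Invented values; only the column names follow the question.
df = pd.DataFrame({
    "STK_ID": [601166, 600036, 600016],
    "EPS": [np.nan, 4.3, np.nan],
    "cash": [np.nan, 10.0, 2.0],
})

# Drop rows where EPS is missing, ignoring NaN in other columns.
with_eps = df.dropna(subset=["EPS"])

# The same filter written as a boolean mask.
same_result = df[df["EPS"].notna()]
print(with_eps)
```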
1435 votes
26 answers
2.4m views

How to deal with SettingWithCopyWarning in Pandas

Background: I just upgraded my Pandas from 0.11 to 0.13.0rc1. Now the application is emitting many new warnings. One of them looks like this: E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value ...
bigbug's user avatar
  • 58.3k
1405 votes
32 answers
2.3m views

Create a Pandas Dataframe by appending one row at a time [duplicate]

How do I create an empty DataFrame, then add rows, one by one? I created an empty DataFrame: df = pd.DataFrame(columns=('lib', 'qty1', 'qty2')) Then I can add a new row at the end and fill a single ...
PhE's user avatar
  • 16.4k
1362 votes
25 answers
2.2m views

Get a list from Pandas DataFrame column headers

I want to get a list of the column headers from a Pandas DataFrame. The DataFrame will come from user input, so I won't know how many columns there will be or what they will be called. For example, ...
natsuki_2002's user avatar
1356 votes
8 answers
1.7m views

Use a list of values to select rows from a Pandas dataframe

Let’s say I have the following Pandas dataframe: df = DataFrame({'A': [5,6,3,4], 'B': [1,2,3,5]}) df A B 0 5 1 1 6 2 2 3 3 3 4 5 I can subset based on a specific value: x =...
zach's user avatar
  • 30.6k
1324 votes
33 answers
2.7m views

How to add a new column to an existing DataFrame

I have the following indexed DataFrame with named columns and row labels that are not continuous numbers: a b c d 2 0.671399 0.101208 -0.181532 0.241273 3 0.446172 -0.243316 0....
tomasz74's user avatar
  • 16.5k
1213 votes
14 answers
1.5m views

Pretty-print an entire Pandas Series / DataFrame

I work with Series and DataFrames on the terminal a lot. The default __repr__ for a Series returns a reduced sample, with some head and tail values, but the rest missing. Is there a builtin way to ...
Dun Peal's user avatar
  • 17.3k
1191 votes
8 answers
951k views

Convert list of dictionaries to a pandas DataFrame

How can I convert a list of dictionaries into a DataFrame? I want to turn [{'points': 50, 'time': '5:00', 'year': 2010}, {'points': 25, 'time': '6:00', 'month': "february"}, {'points':90,...
appleLover's user avatar
  • 15.4k
1184 votes
16 answers
354k views

"Large data" workflows using pandas [closed]

I have tried to puzzle out an answer to this question for many months while learning pandas. I use SAS for my day-to-day work and it is great for its out-of-core support. However, SAS is horrible ...
Zelazny7's user avatar
  • 40.5k
1129 votes
10 answers
2.6m views

Writing a pandas DataFrame to CSV file

I have a dataframe in pandas which I would like to write to a CSV file. I am doing this using: df.to_csv('out.csv') And getting the following error: UnicodeEncodeError: 'ascii' codec can't encode ...
user7289's user avatar
  • 33.7k
1034 votes
20 answers
2.0m views

Deleting DataFrame row in Pandas based on column value

I have the following DataFrame: daysago line_race rating rw wrating line_date 2007-03-31 62 11 56 1.000000 ...
TravisVOX's user avatar
  • 21.4k
1016 votes
23 answers
1.7m views

How do I expand the output display to see more columns of a Pandas DataFrame?

Is there a way to widen the display of output in either interactive or script-execution mode? Specifically, I am using the describe() function on a Pandas DataFrame. When the DataFrame is five ...
beets's user avatar
  • 10.7k
977 votes
22 answers
1.9m views

Combine two columns of text in pandas dataframe

I have a dataframe that looks like Year quarter 2000 q2 2001 q3 How do I add a new column by combining these columns to get the following dataframe? Year quarter period 2000 q2 ...
user2866103's user avatar
  • 10.3k
974 votes
7 answers
823k views

How are iloc and loc different?

Can someone explain how these two methods of slicing are different? I've seen the docs and I've seen previous similar questions (1, 2), but I still find myself unable to understand how they are ...
AZhao's user avatar
  • 14.2k
931 votes
8 answers
436k views

Pandas Merging 101

How can I perform a (INNER| (LEFT|RIGHT|FULL) OUTER) JOIN with pandas? How do I add NaNs for missing rows after a merge? How do I get rid of NaNs after merging? Can I merge on the index? How do I ...
cs95's user avatar
  • 396k
900 votes
18 answers
1.5m views

Filter pandas DataFrame by substring criteria

I have a pandas DataFrame with a column of string values. I need to select rows based on partial string matches. Something like this idiom: re.search(pattern, cell_in_question) returning a boolean. ...
euforia's user avatar
  • 9,085
886 votes
8 answers
2.4m views

Creating an empty Pandas DataFrame, and then filling it

I'm starting from the pandas DataFrame documentation here: Introduction to data structures. I'd like to iteratively fill the DataFrame with values in a time series kind of calculation. I'd like to ...
Matthias Kauer's user avatar
874 votes
12 answers
1.3m views

How to filter Pandas dataframe using 'in' and 'not in' like in SQL

How can I achieve the equivalents of SQL's IN and NOT IN? I have a list with the required values. Here's the scenario: df = pd.DataFrame({'country': ['US', 'UK', 'Germany', 'China']}) ...
LondonRob's user avatar
  • 77k
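
A minimal sketch using `Series.isin`, with the frame from the question. The list `countries_to_keep` is a hypothetical name and value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({"country": ["US", "UK", "Germany", "China"]})
countries_to_keep = ["UK", "China"]  # hypothetical list of required values

# IN: keep rows whose country appears in the list.
in_list = df[df["country"].isin(countries_to_keep)]

# NOT IN: invert the boolean mask with ~.
not_in_list = df[~df["country"].isin(countries_to_keep)]
print(in_list, not_in_list, sep="\n")
```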
869 votes
15 answers
912k views

Shuffle DataFrame rows

I have the following DataFrame: Col1 Col2 Col3 Type 0 1 2 3 1 1 4 5 6 1 ... 20 7 8 9 2 21 10 11 12 2 ... 45 13 14 15 ...
JNevens's user avatar
  • 11.7k
866 votes
15 answers
2.5m views

Truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I want to filter my dataframe with an or condition to keep rows with a particular column's values that are outside the range [-0.25, 0.25]. I tried: df = df[(df['col'] < -0.25) or (df['col'] > 0....
obabs's user avatar
  • 8,971
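
A sketch of the usual fix, with made-up values in `col`. Python's `or` asks each Series for a single truth value, which is what raises the error; the element-wise operator `|` combines the two masks instead. The parentheses matter because `|` binds more tightly than the comparisons.

```python
import pandas as pd

# Made-up values; only the column name "col" comes from the question.
df = pd.DataFrame({"col": [-0.5, 0.0, 0.1, 0.3]})

# Element-wise OR of two boolean masks keeps values outside [-0.25, 0.25].
outside = df[(df["col"] < -0.25) | (df["col"] > 0.25)]
print(outside)
```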
840 votes
10 answers
1.3m views

How to convert index of a pandas dataframe into a column

How to convert an index of a dataframe into a column? For example: gi ptt_loc 0 384444683 593 1 384444684 594 2 384444686 596 to index1 gi ptt_loc ...
msakya's user avatar
  • 9,651
833 votes
24 answers
1.5m views

Constructing DataFrame from values in variables yields "ValueError: If using all scalar values, you must pass an index"

I have two variables as follows. a = 2 b = 3 I want to construct a DataFrame from this: df2 = pd.DataFrame({'A':a, 'B':b}) This generates an error: ValueError: If using all scalar values, you must ...
Nilani Algiriyage's user avatar
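
Two possible ways around the error, sketched with the variables from the question: wrap each scalar in a list, or keep the scalars and pass an explicit index. The names `df_from_lists` and `df_with_index` are illustrative.

```python
import pandas as pd

a = 2
b = 3

# Option 1: one-element lists give pandas a length to build the index from.
df_from_lists = pd.DataFrame({"A": [a], "B": [b]})

# Option 2: keep the scalars and state the index explicitly.
df_with_index = pd.DataFrame({"A": a, "B": b}, index=[0])
print(df_from_lists)
```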
808 votes
32 answers
1.5m views

How do I count the NaN values in a column in pandas DataFrame?

I want to find the number of NaN in each column of my data.
user3799307's user avatar
  • 8,089
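
A short sketch, assuming a small made-up frame: `isna()` marks missing cells with True, and summing the booleans counts them per column.

```python
import numpy as np
import pandas as pd

# Made-up data with one NaN in "a" and two in "b".
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, np.nan, 6.0]})

nan_per_column = df.isna().sum()  # Series indexed by column name
nan_in_a = df["a"].isna().sum()   # count for a single column
print(nan_per_column)
```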
804 votes
12 answers
1.7m views

Get statistics for each group (such as count, mean, etc) using pandas GroupBy?

I have a dataframe df and I use several columns from it to groupby: df[['col1','col2','col3','col4']].groupby(['col1','col2']).mean() In the above way, I almost get the table (dataframe) that I need. ...
Roman's user avatar
  • 129k
795 votes
25 answers
1.8m views

Set value for particular cell in pandas DataFrame using index

I have created a Pandas DataFrame df = DataFrame(index=['A','B','C'], columns=['x','y']) Now, I would like to assign a value to particular cell, for example to row C and column x. In other words, I ...
Mitkp's user avatar
  • 8,029
768 votes
20 answers
986k views

Import multiple CSV files into pandas and concatenate into one DataFrame

I would like to read several CSV files from a directory into pandas and concatenate them into one big DataFrame. I have not been able to figure it out though. Here is what I have so far: import glob ...
jonas's user avatar
  • 13.8k
750 votes
16 answers
1.2m views

How to apply a function to two columns of Pandas dataframe

Suppose I have a function and a dataframe defined as below: def get_sublist(sta, end): return mylist[sta:end+1] df = pd.DataFrame({'ID':['1','2','3'], 'col_1': [0,2,3], 'col_2':[1,4,5]}) mylist = ...
bigbug's user avatar
  • 58.3k
750 votes
6 answers
840k views

How to avoid pandas creating an index in a saved csv

I am trying to save a csv to a folder after making some edits to the file. Every time I use pd.to_csv('C:/Path of file.csv') the csv file has a separate column of indexes. I want to avoid printing ...
Alexis's user avatar
  • 8,921
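
A minimal sketch, assuming a small made-up frame. `to_csv` is a DataFrame method, so it is called as `df.to_csv(...)`, and `index=False` leaves out the index column. The example writes to an in-memory buffer rather than the path in the question so the result can be inspected.

```python
import io

import pandas as pd

# Made-up frame for illustration.
df = pd.DataFrame({"A": [3, 6], "B": [40, 30]})

buffer = io.StringIO()
df.to_csv(buffer, index=False)  # index=False omits the index column
csv_text = buffer.getvalue()
print(csv_text)
```

Passing a file path instead of the buffer writes the same content to disk.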
732 votes
12 answers
536k views

Difference between map, applymap and apply methods in Pandas

Can you tell me when to use these vectorization methods with basic examples? I see that map is a Series method whereas the rest are DataFrame methods. I got confused about apply and applymap methods ...
marillion's user avatar
  • 11k
729 votes
19 answers
2.6m views

How can I get a value from a cell of a dataframe?

I have constructed a condition that extracts exactly one row from my dataframe: d2 = df[(df['l_ext']==l_ext) & (df['item']==item) & (df['wn']==wn) & (df['wd']==1)] Now I would like to ...
Roman's user avatar
  • 129k
728 votes
30 answers
1.5m views

How to check if any value is NaN in a Pandas DataFrame

How do I check whether a pandas DataFrame has NaN values? I know about pd.isnan but it returns a DataFrame of booleans. I also found this post but it doesn't exactly answer my question either.
hlin117's user avatar
  • 21.8k
713 votes
27 answers
1.1m views

UnicodeDecodeError when reading CSV file in Pandas

I'm running a program which is processing 30,000 similar files. A random number of them are stopping and producing this error... File "C:\Importer\src\dfman\importer.py", line 26, in ...
TravisVOX's user avatar
  • 21.4k
707 votes
16 answers
1.7m views

Convert pandas dataframe to NumPy array

How do I convert a pandas dataframe into a NumPy array? DataFrame: import numpy as np import pandas as pd index = [1, 2, 3, 4, 5, 6, 7] a = [np.nan, np.nan, np.nan, 0.1, 0.1, 0.1, 0.1] b = [0.2, np....
Mister Nobody's user avatar
699 votes
50 answers
2.1m views

pandas.parser.CParserError: Error tokenizing data

I'm trying to use pandas to manipulate a .csv file but I get this error: pandas.parser.CParserError: Error tokenizing data. C error: Expected 2 fields in line 3, saw 12 I have tried to read the ...
abuteau's user avatar
  • 7,321
684 votes
6 answers
986k views

How do I check if a pandas DataFrame is empty?

How do I check if a pandas DataFrame is empty? I'd like to print some message in the terminal if the DataFrame is empty.
Nilani Algiriyage's user avatar
684 votes
13 answers
1.1m views

Converting a Pandas GroupBy multiindex output from Series back to DataFrame

I have a dataframe: City Name 0 Seattle Alice 1 Seattle Bob 2 Portland Mallory 3 Seattle Mallory 4 Seattle Bob 5 Portland Mallory I perform the following grouping: g1 ...
saveenr's user avatar
  • 8,549
683 votes
6 answers
1.7m views

How to delete rows from a pandas DataFrame based on a conditional expression [duplicate]

I have a pandas DataFrame and I want to delete rows from it where the length of the string in a particular column is greater than 2. I expect to be able to do this (per this answer): df[(len(df['...
sjs's user avatar
  • 9,180
676 votes
15 answers
1.3m views

How to sort pandas dataframe by one column

I have a dataframe like this: 0 1 2 0 354.7 April 4.0 1 55.4 August 8.0 2 176.5 December 12.0 3 95.5 February 2.0 4 85.6 January 1.0 5 ...
Sachila Ranawaka's user avatar
659 votes
16 answers
1.3m views

How to replace NaN values in a dataframe column

I have a Pandas Dataframe as below: itm Date Amount 67 420 2012-09-30 00:00:00 65211 68 421 2012-09-09 00:00:00 29424 69 421 2012-09-16 00:00:00 29877 70 421 ...
George Thompson's user avatar
632 votes
6 answers
689k views

How to check if a column exists in Pandas

How do I check if a column exists in a Pandas DataFrame df? A B C 0 3 40 100 1 6 30 200 How would I check if the column "A" exists in the above DataFrame so that I can compute:...
npires's user avatar
  • 6,503
631 votes
8 answers
1.3m views

Create new column based on values from other columns / apply a function of multiple columns, row-wise in Pandas

I want to apply my custom function (it uses an if-else ladder) to these six columns (ERI_Hispanic, ERI_AmerInd_AKNatv, ERI_Asian, ERI_Black_Afr.Amer, ERI_HI_PacIsl, ERI_White) in each row of my ...
Dave's user avatar
  • 7,280
628 votes
5 answers
75k views

How can I pivot a dataframe? [closed]

What is pivot? How do I pivot? Long format to wide format? I've seen a lot of questions that ask about pivot tables, even if the askers don't realize it. It is virtually impossible to write a canonical ...
piRSquared's user avatar
  • 292k
