Pandas: KeyError when using column names which are included in an index

I have text files that I'm parsing which contain fixed width fields with lines that look like this:

USC00142401201703TMAX  211  H  133  H  161  H  194  H  206  H  161  H  244  H  178  H-9999     250  H   78  H   44  H   67  H   50  H   39  H  106  H  239  H  239  H  217  H  317  H  311  H  178  H  139  H-9999     228  H-9999   -9999   -9999   -9999   -9999   -9999

I'm parsing these into a pandas DataFrame like so:

from collections import OrderedDict
from pandas import DataFrame
import pandas as pd
import numpy as np

def read_into_dataframe(station_filepath):

    # specify the fixed-width fields
    column_specs = [(0, 11),   # ID
                    (11, 15),  # year
                    (15, 17),  # month
                    (17, 21),  # variable (referred to as element in the GHCND readme.txt)
                    (21, 26),  # day 1
                    (29, 34),  # day 2
                    (37, 42),  # day 3
                    (45, 50),  # day 4
                    (53, 58),  # day 5
                    (61, 66),  # day 6
                    (69, 74),  # day 7
                    (77, 82),  # day 8
                    (85, 90),  # day 9
                    (93, 98),  # day 10
                    (101, 106),  # day 11
                    (109, 114),  # day 12
                    (117, 122),  # day 13
                    (125, 130),  # day 14
                    (133, 138),  # day 15
                    (141, 146),  # day 16
                    (149, 154),  # day 17
                    (157, 162),  # day 18
                    (165, 170),  # day 19
                    (173, 178),  # day 20
                    (181, 186),  # day 21
                    (189, 194),  # day 22
                    (197, 202),  # day 23
                    (205, 210),  # day 24
                    (213, 218),  # day 25
                    (221, 226),  # day 26
                    (229, 234),  # day 27
                    (237, 242),  # day 28
                    (245, 250),  # day 29
                    (253, 258),  # day 30
                    (261, 266)]  # day 31

    # create column names to correspond with the fields specified above
    column_names = ['station_id', 'year', 'month', 'variable',
                    '01', '02', '03', '04', '05', '06', '07', '08', '09', '10',
                    '11', '12', '13', '14', '15', '16', '17', '18', '19', '20',
                    '21', '22', '23', '24', '25', '26', '27', '28', '29', '30',
                    '31']

    # read the fixed width file into a DataFrame columns
    # with the widths and names specified above
    df = pd.read_fwf(station_filepath,
                     header=None,
                     colspecs=column_specs,
                     names=column_names,
                     na_values=-9999)

    # convert the variable column to string data type, all others as integer data type
    df.dropna()  #REVISIT do we really want to do this?
    df['variable'] = df['variable'].astype(str)

    # keep only the rows where the variable value is 'PRCP', 'TMIN', or 'TMAX'
    df = df[df['variable'].isin(['PRCP', 'TMAX', 'TMIN'])]

    # melt the individual day columns into a single day column
    df = pd.melt(df,
                 id_vars=['station_id', 'year', 'month', 'variable'],
                 value_vars=['01', '02', '03', '04', '05', '06', '07', '08', '09', '10',
                             '11', '12', '13', '14', '15', '16', '17', '18', '19', '20',
                             '21', '22', '23', '24', '25', '26', '27', '28', '29', '30', '31'],
                 var_name='day',
                 value_name='value')

    # pivot the DataFrame on the variable type (PRCP, TMIN, TMAX), so each
    # type has a separate column with the day's value for the type
    df = df.pivot_table(index=['year', 'month', 'day'],
                        columns='variable',
                        values='value')

    return df

I now get the DataFrame in the shape I want, except that it contains rows for days that don't exist (e.g. February 31st), which I'd like to remove.

I've tried to do this using masks, but I get a KeyError when I use what I think are valid column names. For example, if I include the following code in the above function before returning the DataFrame, I get a KeyError:

months_with_31days = [1, 3, 7, 8, 10, 12]
df = df[((df['day'] == 31) & (df['month'] in months_with_31days))
        |
        ((df['day'] == 30) & (df['month'] != 2))
        |
        ((df['day'] == 29) & (df['month'] != 2))
        |
        ((df['day'] == 29) & (df['month'] == 2) & calendar.isleap(df['year']))
        |
        df['day'] < 29]

The above will result in a KeyError:

KeyError: 'day'

The day variable was created by the melt() call, then used within the index in the call to pivot_table(). How this affects the indexing of the DataFrame and why it messes up the ability to use the previous column names is not clear to me.

Edit

I assume that I now have a MultiIndex on the DataFrame, created by passing an index argument to pivot_table().

Initial lines displayed when printing the DataFrame:

variable         PRCP   TMAX   TMIN
year month day
1893 1     01     NaN   61.0   33.0
           02     NaN   33.0    6.0
           03     NaN   44.0   17.0
           04     NaN   78.0   22.0
           05     NaN   17.0  -94.0
           06     NaN   33.0    0.0
           07     NaN    0.0  -67.0
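
The MultiIndex assumption can be checked with a minimal sketch. The tiny frame below is made up for illustration (it is not read from the station file) and stands in for the output of melt(). It suggests that the keys passed as index= to pivot_table() end up as index levels rather than columns, and that reset_index() turns them back into ordinary columns:

```python
import pandas as pd

# Tiny long-format frame standing in for the output of pd.melt();
# the values are made up for illustration.
long_df = pd.DataFrame({
    'year': [1893, 1893, 1893, 1893],
    'month': [1, 1, 1, 1],
    'day': ['01', '01', '02', '02'],
    'variable': ['TMAX', 'TMIN', 'TMAX', 'TMIN'],
    'value': [61.0, 33.0, 33.0, 6.0],
})

pivoted = long_df.pivot_table(index=['year', 'month', 'day'],
                              columns='variable',
                              values='value')

# The keys passed as index= are now levels of a MultiIndex, not columns,
# which is why pivoted['day'] raises KeyError
index_names = list(pivoted.index.names)
column_names = list(pivoted.columns)

# Index levels can still be read by name without touching the columns
days = pivoted.index.get_level_values('day')

# reset_index() moves the levels back into ordinary columns
flat = pivoted.reset_index()
has_day_column = 'day' in flat.columns
```

With this layout, df['day'] looks for a column named 'day', finds only PRCP, TMAX and TMIN, and fails, while the level is still reachable through the index.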

I've tried referencing the DataFrame's columns using dot notation instead of brackets with quoted column names, but I get similar errors. It seems like the year, month, and day columns have been merged into a single index column and can no longer be referenced individually. Or maybe something else is going on here? I'm stumped, and wonder if there is a better way to do it.
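
One possible way to drop the impossible dates, sketched here under assumptions: build a date from the index levels with pd.to_datetime(..., errors='coerce'), which yields NaT for dates that don't exist, and keep only the rows where the result is not NaT. The frame below is made up to mimic the pivoted layout, and the names parts, dates and valid are illustrative only:

```python
import pandas as pd

# Made-up frame with the same layout as the pivoted result:
# a (year, month, day) MultiIndex and one column per variable.
index = pd.MultiIndex.from_tuples(
    [(1893, 2, '28'), (1893, 2, '29'), (1893, 2, '31'), (1896, 2, '29')],
    names=['year', 'month', 'day'])
df = pd.DataFrame({'TMAX': [61.0, None, None, 44.0],
                   'TMIN': [33.0, None, None, 17.0]}, index=index)

# Copy the index levels into a plain frame of date parts;
# the day labels are strings here, so convert them to integers
parts = df.index.to_frame(index=False)
parts['day'] = parts['day'].astype(int)

# Assemble dates from the year/month/day columns; impossible dates become NaT
dates = pd.to_datetime(parts[['year', 'month', 'day']], errors='coerce')

# Keep only the rows whose (year, month, day) is a real calendar date
valid = df[dates.notna().to_numpy()]
```

This avoids hand-written month and leap-year rules. If a mask is still preferred, note that `df['month'] in months_with_31days` and `calendar.isleap(df['year'])` do not operate element-wise on a Series; `Series.isin()` is the element-wise membership test.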
