I am trying to parse data from tables printed in bash, and I get strange results when a column is empty. For example,
sometimes the data looks like this:
containerName   ipAddress       memoryMB   name    numberOfCpus   status
--------------- --------------- ---------- ------- -------------- ----------
TEST_VM         192.168.150.111 8192       TEST_VM 4              POWERED_ON
and sometimes like this
containerName   ipAddress   memoryMB   name                   numberOfCpus   status
--------------- ----------- ---------- ---------------------- -------------- -----------
TEST_VM_second              3072       TEST_VM_second_renamed 1              POWERED_OFF
I tried with Python and with bash and got the same results. I need the "name" column, but when I use bash,
for example awk '{print $4}', the first table prints the expected result:
name
-------
TEST_VM
but for the second table it prints:
name
----------------------
1
I get the same results with Python:
df_info = pd.read_table(StringIO(table), delim_whitespace=True)
df_info = df_info.drop(0)
pd.set_option('display.max_colwidth', None)
print(df_info['name'], df_info['containerName'])
Output:
1 TEST_VM
Name: name, dtype: object 1 TEST_VM
Name: containerName, dtype: object
1 1
Name: name, dtype: object 1 TEST_VM_second
Name: containerName, dtype: object
Does anyone know how to handle the case where the ipAddress field is empty?
>Solution :
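Don’t parse the file manually and don’t split on whitespace: when ipAddress is empty there is nothing but spaces between the neighbouring columns, so every later field shifts left by one. The table is fixed-width, so take advantage of pandas.read_fwf, which infers the column boundaries from character positions:
from io import StringIO
import pandas as pd
df_info = pd.read_fwf(StringIO(table), skiprows=[1])  # skiprows=[1] drops the dashes row
Do not pass delim_whitespace=True to read_fwf. It is deprecated in recent pandas versions, and with read_fwf it changes which characters are stripped from each field, which truncates names such as ipAddress to ipAddre and status to tatu.
Output (the empty ipAddress is read as NaN):
    containerName  ipAddress  memoryMB                    name  numberOfCpus       status
0  TEST_VM_second        NaN      3072  TEST_VM_second_renamed             1  POWERED_OFF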
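If pandas is not available, the same fixed-width idea can be sketched with the standard library alone, using the dashes row as a ruler for the column boundaries. This is only a minimal sketch and not part of the original answer: the function name parse_ruled_table is hypothetical, the sample spacing is reconstructed from the widths of the dash runs, and it assumes every value sits inside the run of dashes under its header.

```python
import re

# Sample input: spacing reconstructed from the dash widths (an assumption).
TABLE = """\
containerName   ipAddress   memoryMB   name                   numberOfCpus   status
--------------- ----------- ---------- ---------------------- -------------- -----------
TEST_VM_second              3072       TEST_VM_second_renamed 1              POWERED_OFF
"""


def parse_ruled_table(text):
    """Parse a table whose second line is a row of dashes marking each column.

    Hypothetical helper: returns one dict per data row, with empty
    fields kept as empty strings instead of being skipped.
    """
    lines = [line for line in text.splitlines() if line.strip()]
    header, ruler, rows = lines[0], lines[1], lines[2:]
    # Each run of dashes gives the (start, end) character span of one column.
    spans = [match.span() for match in re.finditer(r"-+", ruler)]
    names = [header[start:end].strip() for start, end in spans]
    return [
        {name: row[start:end].strip() for name, (start, end) in zip(names, spans)}
        for row in rows
    ]


rows = parse_ruled_table(TABLE)
print(rows[0]["name"])             # TEST_VM_second_renamed
print(repr(rows[0]["ipAddress"]))  # ''
```

Slicing by position rather than splitting on whitespace is what keeps the empty ipAddress from shifting the later columns.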