Pandas truncates strings in numpy list

March 21, 2023

Consider the following minimal example:

from dataclasses import dataclass

import numpy
import pandas

@dataclass
class ExportEngine:

    def __post_init__(self):
        self.list = pandas.DataFrame(columns=list(MyObject.CSVHeaders()))

    def export(self):
        self.prepare()
        self.list.to_csv("~/Desktop/test.csv")

    def prepare(self):
        values = numpy.concatenate(
            (
                numpy.array(["Col1Value", "Col2Value", " Col3Value", "Col4Value"]),
                numpy.repeat("", 24),
            )
        )
        for x in range(8):  # not the best way, but done due to other constraints
            start = 3 + (x * 3) - 2
            end = start + 3
            values[start:end] = [
                "123",
                "some_random_value_that_gets_truncated",
                "456",
            ]
        self.list.loc[len(self.list)] = values

When export() is called, some_random_value_that_gets_truncated is truncated to some_rando:

['Col1Value', '123', 'some_rando', '456', '123', 'some_rando', '456', '123', 'some_rando', '456', '123', 'some_rando', '456', '123', ...]

I’ve tried setting pandas.set_option("display.max_colwidth", 10000), but this doesn’t change anything.

Why does this happen, and how can I prevent the truncation?

Solution:

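By default, numpy stores strings in a fixed-width unicode dtype sized to fit the longest string present when the array is created. Here the longest initial string is " Col3Value" (10 characters, counting the leading space), so the array gets dtype <U10, and any longer string assigned into it later is silently cut to 10 characters. The truncation happens inside numpy, before pandas ever sees the data, which is why display.max_colwidth has no effect.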

Notice the dtype:

In [1]: import numpy

In [2]: values = numpy.concatenate(
   ...:     (
   ...:         numpy.array(["Col1Value", "Col2Value", " Col3Value", "Col4Value"]),
   ...:         numpy.repeat("", 24),
   ...:     )
   ...: )

In [3]: values
Out[3]:
array(['Col1Value', 'Col2Value', ' Col3Value', 'Col4Value', '', '', '',
       '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
       '', '', '', ''], dtype='<U10')
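
To see the truncation in isolation, here is a minimal sketch using only numpy. It reuses the four strings from the question; the variable name `values` is just for illustration.

```python
import numpy

# The longest initial string is " Col3Value" (10 characters including the
# leading space), so numpy picks a 10-character fixed-width unicode dtype.
values = numpy.array(["Col1Value", "Col2Value", " Col3Value", "Col4Value"])
print(values.dtype.kind, values.dtype.itemsize // 4)  # U 10

# Assigning a longer string does not raise; it is silently cut to 10 characters.
values[1] = "some_random_value_that_gets_truncated"
print(values[1])  # some_rando
```

No warning or error is raised on assignment, which is why the problem only shows up in the exported CSV.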

You probably don't need numpy here at all (a plain Python list would work), but one quick fix is to replace:

values = numpy.concatenate(
    (
        numpy.array(["Col1Value", "Col2Value", " Col3Value", "Col4Value"]),
        numpy.repeat("", 24),
    )
)

with:

values = numpy.array(
    ['Col1Value', 'Col2Value', ' Col3Value', 'Col4Value', *[""]*24],
    dtype=object
)

Notice the dtype=object: the array then holds references to ordinary Python str objects rather than fixed-width character buffers, so there is no limit on the length of the strings.
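
As a rough check, the sketch below runs the loop from the question against an object-dtype array, and against a plain Python list as one way to avoid numpy entirely. The names `values_obj`, `values_list` and `chunk` are made up for this example; the loop bounds are copied from the question.

```python
import numpy

chunk = ["123", "some_random_value_that_gets_truncated", "456"]

# Option 1: object dtype, so each element is a reference to a Python str.
values_obj = numpy.array(
    ["Col1Value", "Col2Value", " Col3Value", "Col4Value", *[""] * 24],
    dtype=object,
)

# Option 2: no numpy at all, just a list of the same 28 entries.
values_list = ["Col1Value", "Col2Value", " Col3Value", "Col4Value"] + [""] * 24

for x in range(8):
    start = 3 + (x * 3) - 2
    end = start + 3
    values_obj[start:end] = chunk
    values_list[start:end] = chunk  # same-length slice assignment keeps 28 items

print(values_obj[2])   # some_random_value_that_gets_truncated
print(values_list[2])  # some_random_value_that_gets_truncated
```

Either result can then be assigned as a row with self.list.loc[len(self.list)] = ..., as in the question, provided its length matches the number of columns.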

truncation

by MR

Published March 21, 2023
