Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Creating a single-column dataframe in Pandas from a Django queryset

I’m trying to create a dataframe containing the values from field_1 and field_2 in a single column. I haven’t used pandas a whole lot before, so I’m sure this is naive.

# Create a dataset in CSV format
field_names = ["description", "comments"]
writer = csv.writer(open("dataset.csv", "w"), quoting=csv.QUOTE_ALL, delimiter=",")
writer.writerow(field_names)
for instance in Order.objects.all():
    writer.writerow([str(getattr(instance, f)) for f in field_names])

# Read CSV
data_frame = pd.read_csv("dataset.csv", index_col=0)
# Combine columns
df2 = data_frame.apply(lambda x: ", ".join(x[x.notnull()]), axis=1)  

If I’m working with a fairly large dataset, is there a way I can make this more efficient? I would like to eliminate the step that creates the CSV entirely, if possible. If anybody can point me in the right direction, that’d be fantastic.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

You really don’t need.

  • The order objects and getattr; use .values_list() to get an iterable of 2-tuples (assuming field_names are actual fields on the model).
  • CSV – now that you have an iterable of 2-tuples, pass them to the DataFrame constructor along with the respective column names.
field_names = ["description", "comments"]
df = pd.DataFrame(Order.objects.all().values_list(field_names), columns=field_names)

Also, you wouldn’t really need Pandas at all…

from django.db.models import F, Value
from django.db.models.functions import Concat

# ... 
my_data = list(
    Order.objects.annotate(
        x=Concat(
            F("description"),
            Value(", "),
            F("comments"),
        )
    ).values_list("x", flat=True)
)

and you have a list of description, comments strings, just like the series you’d get with Pandas.

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading