How to limit the number of features read in using GeoPandas?

I have the following Python code, which reads the features of a shapefile into a GeoDataFrame and builds point geometry from the x and y columns.

import math
import shapely.geometry
import geopandas as gpd
from shapely.ops import nearest_points

absolute_path_to_shapefile = 'c:/test/test1.shp'

gdf1 = gpd.read_file(absolute_path_to_shapefile)
gdf = gpd.GeoDataFrame(
    gdf1, geometry=gpd.points_from_xy(gdf1['x'], gdf1['y']))

Is there a way to limit the number of features read in? Some shapefiles have millions of points, but I only want to read the first 100 as a proof of concept.

>Solution :

GeoPandas read_file() has a rows parameter that limits what is read: pass an integer to read only the first n rows, or a slice object to read a specific range of rows.

import math
import shapely.geometry
import geopandas as gpd
from shapely.ops import nearest_points

absolute_path_to_shapefile = 'c:/test/test1.shp'

gdf1 = gpd.read_file(absolute_path_to_shapefile, rows=100)
gdf = gpd.GeoDataFrame(gdf1, geometry=gpd.points_from_xy(gdf1['x'], gdf1['y']))

GeoPandas documentation

geopandas.read_file(filename, bbox=None, mask=None, rows=None, **kwargs)
Returns a GeoDataFrame from a file or URL.

Parameters
filename: str, path object or file-like object
Either the absolute or relative path to the file or URL to be opened, or any object with a read() method (such as an open file or StringIO)

bbox: tuple | GeoDataFrame or GeoSeries | shapely Geometry, default None
Filter features by given bounding box, GeoSeries, GeoDataFrame or a shapely geometry. CRS mis-matches are resolved if given a GeoSeries or GeoDataFrame. Tuple is (minx, miny, maxx, maxy) to match the bounds property of shapely geometry objects. Cannot be used with mask.

mask: dict | GeoDataFrame or GeoSeries | shapely Geometry, default None
Filter for features that intersect with the given dict-like geojson geometry, GeoSeries, GeoDataFrame or shapely geometry. CRS mis-matches are resolved if given a GeoSeries or GeoDataFrame. Cannot be used with bbox.

rows: int or slice, default None
Load in specific rows by passing an integer (first n rows) or a slice() object.

**kwargs :
Keyword args to be passed to the open or BytesCollection method in the fiona library when opening the file. For more information on possible keywords, type: import fiona; help(fiona.open)

Returns
geopandas.GeoDataFrame or pandas.DataFrame
If ignore_geometry=True a pandas.DataFrame will be returned.
