- pandas.read_csv(filepath_or_buffer, *, sep=_NoDefault.no_default, delimiter=None, header='infer', names=_NoDefault.no_default, index_col=None, usecols=None, squeeze=None, prefix=_NoDefault.no_default, mangle_dupe_cols=True, dtype=None, engine=None, converters=None, true_values=None, false_values=None, skipinitialspace=False, skiprows=None, skipfooter=0, nrows=None, na_values=None, keep_default_na=True, na_filter=True, verbose=False, skip_blank_lines=True, parse_dates=None, infer_datetime_format=False, keep_date_col=False, date_parser=None, dayfirst=False, cache_dates=True, iterator=False, chunksize=None, compression='infer', thousands=None, decimal='.', lineterminator=None, quotechar='"', quoting=0, doublequote=True, escapechar=None, comment=None, encoding=None, encoding_errors='strict', dialect=None, error_bad_lines=None, warn_bad_lines=None, on_bad_lines=None, delim_whitespace=False, low_memory=True, memory_map=False, float_precision=None, storage_options=None)
Read a comma-separated values (csv) file into DataFrame.
Also supports optionally iterating or breaking the file into chunks.
Additional help can be found in the online docs for IO Tools.
- Parameters
- filepath_or_buffer : str, path object or file-like object
Any valid string path is acceptable. The string could be a URL. Valid URL schemes include http, ftp, s3, gs, and file. For file URLs, a host is expected. A local file could be: file://localhost/path/to/table.csv.
If you want to pass in a path object, pandas accepts any os.PathLike.
By file-like object, we refer to objects with a read() method, such as a file handle (e.g. via builtin open function) or StringIO.
- sep : str, default ','
Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and the separator is detected automatically by Python's builtin sniffer tool, csv.Sniffer. In addition, separators longer than 1 character and different from '\s+' will be interpreted as regular expressions and will also force the use of the Python parsing engine. Note that regex delimiters are prone to ignoring quoted data. Regex example: '\r\t'.
- delimiter : str, default None
Alias for sep.
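A minimal sketch of the sep parameter, using hypothetical in-memory sample data rather than a file on disk:

```python
import io
import pandas as pd

# Hypothetical sample data using a semicolon delimiter.
data = io.StringIO("a;b;c\n1;2;3\n4;5;6\n")

# An explicit single-character sep keeps the fast C engine.
df = pd.read_csv(data, sep=";")
print(df.shape)  # (2, 3)
```

Passing a multi-character or regex separator instead would silently switch parsing to the Python engine, as described above.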
- header : int, list of int, None, default 'infer'
Row number(s) to use as the column names, and the start of the data. Default behavior is to infer the column names: if no names are passed, the behavior is identical to header=0 and column names are inferred from the first line of the file; if column names are passed explicitly, then the behavior is identical to header=None. Explicitly pass header=0 to be able to replace existing names. The header can be a list of integers that specify row locations for a multi-index on the columns, e.g. [0,1,3]. Intervening rows that are not specified will be skipped (e.g. 2 in this example is skipped). Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.
- names : array-like, optional
List of column names to use. If the file contains a header row, then you should explicitly pass header=0 to override the column names. Duplicates in this list are not allowed.
- index_col : int, str, sequence of int/str, or False, optional, default None
Column(s) to use as the row labels of the DataFrame, either given as string name or column index. If a sequence of int/str is given, a MultiIndex is used.
Note: index_col=False can be used to force pandas to not use the first column as the index, e.g. when you have a malformed file with delimiters at the end of each line.
- usecols : list-like or callable, optional
Return a subset of the columns. If list-like, all elements must either be positional (i.e. integer indices into the document columns) or strings that correspond to column names provided either by the user in names or inferred from the document header row(s). If names are given, the document header row(s) are not taken into account. For example, a valid list-like usecols parameter would be [0, 1, 2] or ['foo', 'bar', 'baz']. Element order is ignored, so usecols=[0, 1] is the same as [1, 0]. To instantiate a DataFrame from data with element order preserved, use pd.read_csv(data, usecols=['foo', 'bar'])[['foo', 'bar']] for columns in ['foo', 'bar'] order or pd.read_csv(data, usecols=['foo', 'bar'])[['bar', 'foo']] for ['bar', 'foo'] order.
If callable, the callable function will be evaluated against the column names, returning names where the callable function evaluates to True. An example of a valid callable argument would be lambda x: x.upper() in ['AAA', 'BBB', 'DDD']. Using this parameter results in much faster parsing time and lower memory usage.
- squeeze : bool, default False
If the parsed data only contains one column, then return a Series.
Deprecated since version 1.4.0: Append .squeeze("columns") to the call to read_csv to squeeze the data.
- prefix : str, optional
Prefix to add to column numbers when no header, e.g. 'X' for X0, X1, …
Deprecated since version 1.4.0: Use a list comprehension on the DataFrame's columns after calling read_csv.
- mangle_dupe_cols : bool, default True
Duplicate columns will be specified as 'X', 'X.1', … 'X.N', rather than 'X' … 'X'. Passing in False will cause data to be overwritten if there are duplicate names in the columns.
Deprecated since version 1.5.0: Not implemented, and a new argument to specify the pattern for the names of duplicated columns will be added instead.
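The usecols behavior described above can be sketched with a small in-memory example (hypothetical column names):

```python
import io
import pandas as pd

data = "foo,bar,baz\n1,2,3\n4,5,6\n"

# usecols ignores element order: the result keeps the file's column order.
df = pd.read_csv(io.StringIO(data), usecols=["baz", "foo"])
print(list(df.columns))  # ['foo', 'baz']

# A callable selects the columns for which it evaluates to True.
df2 = pd.read_csv(io.StringIO(data), usecols=lambda c: c.upper() in ["FOO", "BAR"])
print(list(df2.columns))  # ['foo', 'bar']
```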
- dtype : Type name or dict of column -> type, optional
Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32, 'c': 'Int64'}. Use str or object together with suitable na_values settings to preserve and not interpret dtype. If converters are specified, they will be applied INSTEAD of dtype conversion.
New in version 1.5.0: Support for defaultdict was added. Specify a defaultdict as input, where the default determines the dtype of the columns which are not explicitly listed.
- engine : {'c', 'python', 'pyarrow'}, optional
Parser engine to use. The C and pyarrow engines are faster, while the python engine is currently more feature-complete. Multithreading is currently only supported by the pyarrow engine.
New in version 1.4.0: The 'pyarrow' engine was added as an experimental engine, and some features are unsupported, or may not work correctly, with this engine.
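A short sketch of per-column dtype control, using made-up data where keeping a column as str matters (leading zeros would otherwise be lost to integer inference):

```python
import io
import numpy as np
import pandas as pd

data = "a,b,c\n1,2.5,007\n3,4.5,008\n"

# Force dtypes per column; reading 'c' as str preserves the leading zeros.
df = pd.read_csv(io.StringIO(data), dtype={"a": np.int32, "b": np.float64, "c": str})
print(df["c"].tolist())  # ['007', '008']
```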
- converters : dict, optional
Dict of functions for converting values in certain columns. Keys can either be integers or column labels.
- true_values : list, optional
Values to consider as True.
- false_values : list, optional
Values to consider as False.
- skipinitialspace : bool, default False
Skip spaces after delimiter.
- skiprows : list-like, int or callable, optional
Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file.
If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2].
- skipfooter : int, default 0
Number of lines at the bottom of the file to skip (unsupported with engine='c').
- nrows : int, optional
Number of rows of file to read. Useful for reading pieces of large files.
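A sketch combining skiprows and skipfooter on hypothetical data with a junk line at the top and a summary line at the bottom (skipfooter forces the python engine, per the note above):

```python
import io
import pandas as pd

data = "junk header line\na,b\n1,2\n3,4\ntrailing summary\n"

# Skip one line at the top and one at the bottom; skipfooter needs engine="python".
df = pd.read_csv(io.StringIO(data), skiprows=1, skipfooter=1, engine="python")
print(df.shape)  # (2, 2)
```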
- na_values : scalar, str, list-like, or dict, optional
Additional strings to recognize as NA/NaN. If dict passed, specific per-column NA values. By default the following values are interpreted as NaN: '', '#N/A', '#N/A N/A', '#NA', '-1.#IND', '-1.#QNAN', '-NaN', '-nan', '1.#IND', '1.#QNAN', '<NA>', 'N/A', 'NA', 'NULL', 'NaN', 'n/a', 'nan', 'null'.
- keep_default_na : bool, default True
Whether or not to include the default NaN values when parsing the data. Depending on whether na_values is passed in, the behavior is as follows:
If keep_default_na is True, and na_values are specified, na_values is appended to the default NaN values used for parsing.
If keep_default_na is True, and na_values are not specified, only the default NaN values are used for parsing.
If keep_default_na is False, and na_values are specified, only the NaN values specified in na_values are used for parsing.
If keep_default_na is False, and na_values are not specified, no strings will be parsed as NaN.
Note that if na_filter is passed in as False, the keep_default_na and na_values parameters will be ignored.
- na_filter : bool, default True
Detect missing value markers (empty strings and the value of na_values). In data without any NAs, passing na_filter=False can improve the performance of reading a large file.
- verbose : bool, default False
Indicate number of NA values placed in non-numeric columns.
- skip_blank_lines : bool, default True
If True, skip over blank lines rather than interpreting as NaN values.
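The na_values/keep_default_na interaction above can be sketched with a hypothetical two-row file:

```python
import io
import pandas as pd

data = "a,b\n1,missing\n2,NA\n"

# 'missing' is appended to the default NaN markers, so both rows become NaN.
df = pd.read_csv(io.StringIO(data), na_values=["missing"])
print(df["b"].isna().tolist())  # [True, True]

# With keep_default_na=False only 'missing' is treated as NaN; 'NA' survives as a string.
df2 = pd.read_csv(io.StringIO(data), na_values=["missing"], keep_default_na=False)
print(df2["b"].isna().tolist())  # [True, False]
```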
- parse_dates : bool or list of int or names or list of lists or dict, default False
The behavior is as follows:
boolean. If True -> try parsing the index.
list of int or names. e.g. If [1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column.
list of lists. e.g. If [[1, 3]] -> combine columns 1 and 3 and parse as a single date column.
dict, e.g. {'foo' : [1, 3]} -> parse columns 1, 3 as date and call result 'foo'.
If a column or index cannot be represented as an array of datetimes, say because of an unparsable value or a mixture of timezones, the column or index will be returned unaltered as an object data type. For non-standard datetime parsing, use pd.to_datetime after pd.read_csv. To parse an index or column with a mixture of timezones, specify date_parser to be a partially-applied pandas.to_datetime() with utc=True. See Parsing a CSV with mixed timezones for more.
Note: A fast-path exists for iso8601-formatted dates.
- infer_datetime_format : bool, default False
If True and parse_dates is enabled, pandas will attempt to infer the format of the datetime strings in the columns, and if it can be inferred, switch to a faster method of parsing them. In some cases this can increase the parsing speed by 5-10x.
- keep_date_col : bool, default False
If True and parse_dates specifies combining multiple columns, then keep the original columns.
- date_parser : function, optional
Function to use for converting a sequence of string columns to an array of datetime instances. The default uses dateutil.parser.parser to do the conversion. pandas will try to call date_parser in three different ways, advancing to the next if an exception occurs: 1) Pass one or more arrays (as defined by parse_dates) as arguments; 2) concatenate (row-wise) the string values from the columns defined by parse_dates into a single array and pass that; and 3) call date_parser once for each row using one or more strings (corresponding to the columns defined by parse_dates) as arguments.
- dayfirst : bool, default False
DD/MM format dates, international and European format.
- cache_dates : bool, default True
If True, use a cache of unique, converted dates to apply the datetime conversion. May produce significant speed-up when parsing duplicate date strings, especially ones with timezone offsets.
New in version 0.25.0.
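A minimal parse_dates sketch, using hypothetical ISO 8601 dates that hit the fast path mentioned above:

```python
import io
import pandas as pd

data = "date,value\n2023-01-01,10\n2023-01-02,20\n"

# The named column comes back as datetime64[ns] instead of object strings.
df = pd.read_csv(io.StringIO(data), parse_dates=["date"])
print(df["date"].dtype)  # datetime64[ns]
```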
- iterator : bool, default False
Return TextFileReader object for iteration or getting chunks with get_chunk().
Changed in version 1.2: TextFileReader is a context manager.
- chunksize : int, optional
Return TextFileReader object for iteration. See the IO Tools docs for more information on iterator and chunksize.
Changed in version 1.2: TextFileReader is a context manager.
- compression : str or dict, default 'infer'
For on-the-fly decompression of on-disk data. If 'infer' and 'filepath_or_buffer' is path-like, then detect compression from the following extensions: '.gz', '.bz2', '.zip', '.xz', '.zst', '.tar', '.tar.gz', '.tar.xz' or '.tar.bz2' (otherwise no compression). If using 'zip' or 'tar', the archive must contain only one data file to be read in. Set to None for no decompression. Can also be a dict with key 'method' set to one of {'zip', 'gzip', 'bz2', 'zstd', 'tar'}; other key-value pairs are forwarded to zipfile.ZipFile, gzip.GzipFile, bz2.BZ2File, zstandard.ZstdDecompressor or tarfile.TarFile, respectively. As an example, the following could be passed for Zstandard decompression using a custom compression dictionary: compression={'method': 'zstd', 'dict_data': my_compression_dict}.
New in version 1.5.0: Added support for .tar files.
Changed in version 1.4.0: Zstandard support.
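The chunksize/iterator behavior described above can be sketched as follows, using hypothetical data generated in memory; since pandas 1.2 the returned TextFileReader works as a context manager:

```python
import io
import pandas as pd

data = "a,b\n" + "".join(f"{i},{i * 2}\n" for i in range(10))

# chunksize returns a TextFileReader that yields DataFrames of up to 4 rows each.
total = 0
with pd.read_csv(io.StringIO(data), chunksize=4) as reader:
    for chunk in reader:
        total += chunk["b"].sum()
print(total)  # 90
```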
- thousands : str, optional
Thousands separator.
- decimal : str, default '.'
Character to recognize as decimal point (e.g. use ',' for European data).
- lineterminator : str (length 1), optional
Character to break file into lines. Only valid with C parser.
- quotechar : str (length 1), optional
The character used to denote the start and end of a quoted item. Quoted items can include the delimiter and it will be ignored.
- quoting : int or csv.QUOTE_* instance, default 0
Control field quoting behavior per csv.QUOTE_* constants. Use one of QUOTE_MINIMAL (0), QUOTE_ALL (1), QUOTE_NONNUMERIC (2) or QUOTE_NONE (3).
- doublequote : bool, default True
When quotechar is specified and quoting is not QUOTE_NONE, indicate whether or not to interpret two consecutive quotechar elements INSIDE a field as a single quotechar element.
- escapechar : str (length 1), optional
One-character string used to escape other characters.
- comment : str, optional
Indicates remainder of line should not be parsed. If found at the beginning of a line, the line will be ignored altogether. This parameter must be a single character. Like empty lines (as long as skip_blank_lines=True), fully commented lines are ignored by the parameter header but not by skiprows. For example, if comment='#', parsing #empty\na,b,c\n1,2,3 with header=0 will result in 'a,b,c' being treated as the header.
- encoding : str, optional
Encoding to use for UTF when reading/writing (e.g. 'utf-8'). List of Python standard encodings.
Changed in version 1.2: When encoding is None, errors='replace' is passed to open(). Otherwise, errors='strict' is passed to open(). This behavior was previously only the case for engine='python'.
Changed in version 1.3.0: encoding_errors is a new argument. encoding no longer influences how encoding errors are handled.
- encoding_errors : str, optional, default 'strict'
How encoding errors are treated. List of possible values.
New in version 1.3.0.
- dialect : str or csv.Dialect, optional
If provided, this parameter will override values (default or not) for the following parameters: delimiter, doublequote, escapechar, skipinitialspace, quotechar, and quoting. If it is necessary to override values, a ParserWarning will be issued. See csv.Dialect documentation for more details.
- error_bad_lines : bool, optional, default None
Lines with too many fields (e.g. a csv line with too many commas) will by default cause an exception to be raised, and no DataFrame will be returned. If False, then these "bad lines" will be dropped from the DataFrame that is returned.
Deprecated since version 1.3.0: The on_bad_lines parameter should be used instead to specify behavior upon encountering a bad line.
- warn_bad_lines : bool, optional, default None
If error_bad_lines is False, and warn_bad_lines is True, a warning for each "bad line" will be output.
Deprecated since version 1.3.0: The on_bad_lines parameter should be used instead to specify behavior upon encountering a bad line.
- on_bad_lines : {'error', 'warn', 'skip'} or callable, default 'error'
Specifies what to do upon encountering a bad line (a line with too many fields). Allowed values are:
'error', raise an Exception when a bad line is encountered.
'warn', raise a warning when a bad line is encountered and skip that line.
'skip', skip bad lines without raising or warning when they are encountered.
New in version 1.3.0.
New in version 1.4.0: callable, function with signature (bad_line: list[str]) -> list[str] | None that will process a single bad line. bad_line is a list of strings split by the sep. If the function returns None, the bad line will be ignored. If the function returns a new list of strings with more elements than expected, a ParserWarning will be emitted while dropping extra elements. Only supported when engine='python'.
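A sketch of the callable form of on_bad_lines (pandas >= 1.4), repairing a hypothetical row that has one field too many; the helper name is illustrative:

```python
import io
import pandas as pd

# The second data row has one field too many.
data = "a,b\n1,2\n3,4,5\n6,7\n"

def trim_extra_fields(bad_line):
    # bad_line is the offending row split by sep; returning None would drop it.
    return bad_line[:2]

# A callable on_bad_lines requires engine="python".
df = pd.read_csv(io.StringIO(data), on_bad_lines=trim_extra_fields, engine="python")
print(df.shape)  # (3, 2)
```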
- delim_whitespace : bool, default False
Specifies whether or not whitespace (e.g. ' ' or '\t') will be used as the sep. Equivalent to setting sep='\s+'. If this option is set to True, nothing should be passed in for the delimiter parameter.
- low_memory : bool, default True
Internally process the file in chunks, resulting in lower memory use while parsing, but possibly mixed type inference. To ensure no mixed types, either set False, or specify the type with the dtype parameter. Note that the entire file is read into a single DataFrame regardless; use the chunksize or iterator parameter to return the data in chunks. (Only valid with C parser).
- memory_map : bool, default False
If a filepath is provided for filepath_or_buffer, map the file object directly onto memory and access the data directly from there. Using this option can improve performance because there is no longer any I/O overhead.
- float_precision : str, optional
Specifies which converter the C engine should use for floating-point values. The options are None or 'high' for the ordinary converter, 'legacy' for the original lower-precision pandas converter, and 'round_trip' for the round-trip converter.
Changed in version 1.2.
- storage_options : dict, optional
Extra options that make sense for a particular storage connection, e.g. host, port, username, password, etc. For HTTP(S) URLs the key-value pairs are forwarded to urllib.request.Request as header options. For other URLs (e.g. starting with 's3://' and 'gcs://') the key-value pairs are forwarded to fsspec.open. Please see fsspec and urllib for more details, and for more examples on storage options refer here.
New in version 1.2.
- Returns
- DataFrame or TextParser
A comma-separated values (csv) file is returned as a two-dimensional data structure with labeled axes.
See also
- DataFrame.to_csv
Write DataFrame to a comma-separated values (csv) file.
- read_csv
Read a comma-separated values (csv) file into DataFrame.
- read_fwf
Read a table of fixed-width formatted lines into DataFrame.
Examples
>>> pd.read_csv('data.csv')
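The doc example above assumes a 'data.csv' on disk; a self-contained variant using an in-memory buffer with made-up contents:

```python
import io
import pandas as pd

# Stand-in for 'data.csv' so the example runs without a file on disk.
buffer = io.StringIO("col1,col2\na,1\nb,2\n")
df = pd.read_csv(buffer)
print(df.shape)  # (2, 2)
```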
pandas.read_csv - Pandas 1.5.3 documentation (2023)