HDFS


On what basis does HDFS break a big file into chunks?

  • Number of lines
  • Both lines and byte size. A chunk may be slightly smaller than the predefined size so that it contains only complete lines.
  • A fixed byte size, 128 MB by default. All chunks are the same size (in bytes) except the last one, which may be smaller.
  • You can define your own way of splitting.
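
To make the fixed-byte-size option concrete, here is a minimal sketch of the arithmetic it describes. It does not call any Hadoop API; the function name `split_into_blocks` and the 300 MB example file are hypothetical, and the 128 MB default is the value quoted in the option above.

```python
MB = 1024 * 1024

def split_into_blocks(file_size, block_size=128 * MB):
    """Return the byte length of each chunk for a file of file_size bytes.

    Illustrative only: splits purely by byte count, ignoring line boundaries.
    """
    if file_size < 0 or block_size <= 0:
        raise ValueError("file_size must be >= 0 and block_size must be > 0")
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # Only the last chunk may be smaller than block_size.
        sizes.append(remainder)
    return sizes

# A hypothetical 300 MB file yields two full chunks and one smaller final chunk.
print([size // MB for size in split_into_blocks(300 * MB)])  # [128, 128, 44]
```

Under this scheme a line of text can straddle two chunks, since the cut points depend only on byte offsets.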