Linux Basics for Big Data

Linux Files & Processes

Everything in Unix is either a file or a process.


When you run a program, a process is created. Every process is identified by a number called the process ID (PID). To list the processes you are running, execute the "ps" command in the shell. You can think of the process ID as a sequence number assigned by the operating system. It may differ from one execution of the same program to the next.
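
As a rough illustration of process IDs, the following Python sketch prints the PID of the running script and then starts the same program twice as child processes. The helper name spawn_child is made up for this example, and the sketch assumes a Python 3 interpreter is available at sys.executable. Because both children exist at the same time, the operating system has to give them different PIDs.

```python
import os
import subprocess
import sys


def spawn_child():
    """Start a short-lived child process (hypothetical helper for this sketch)."""
    # sys.executable is the path of the interpreter running this script;
    # "-c pass" makes the child do nothing and exit immediately.
    return subprocess.Popen([sys.executable, "-c", "pass"])


# The operating system assigned this number when the script was started.
my_pid = os.getpid()
print("PID of this script:", my_pid)

# Run the same program twice. Neither child is waited on before the other
# starts, so the two PIDs cannot be the same.
first = spawn_child()
second = spawn_child()
print("PIDs of the two children:", first.pid, second.pid)

# Wait for both children to finish so no stray processes are left behind.
first.wait()
second.wait()
```

Running the script again will usually print different numbers, which matches the point that a PID is not tied to the program itself.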


A file is a sequence of data. A file may be created by a user with a word processor or text editor, or by a program that needs to store information. A program is itself stored as a file, and when it is run, the kernel loads it as a process.
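
To make "a sequence of data" concrete, here is a minimal Python sketch that writes a few bytes to a temporary file and reads them back as numbers. The helper name write_and_read is invented for this example. The last lines check whether the Python interpreter itself, a program, is stored as a file; this assumes sys.executable is set, which is usually but not always the case.

```python
import os
import sys
import tempfile


def write_and_read(data: bytes) -> bytes:
    """Write data to a temporary file on disk and read it back (illustrative helper)."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        with open(path, "rb") as f:
            return f.read()
    finally:
        # Remove the temporary file so the example leaves nothing behind.
        os.remove(path)


content = write_and_read(b"hello\n")
# Each byte of the file is just a number between 0 and 255.
print(list(content))  # [104, 101, 108, 108, 111, 10]

# The interpreter running this script is a program stored as a file.
program_is_file = bool(sys.executable) and os.path.isfile(sys.executable)
print("Interpreter stored as a file:", program_is_file)
```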

A file is generally written to disk so that it persists even after the computer restarts. The disk is either a hard disk drive (HDD, cheaper and slower) or a solid-state drive (SSD, faster but costlier).

A file is identified by a name called the file path. In Unix, everything is represented as a file:

  1. Devices such as the mouse and keyboard

  2. Programs, which are saved as files

  3. The disk and the monitor
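
The idea that devices appear as files can be checked with a short Python sketch. It assumes a Linux system where the device file /dev/null exists; the function name describe is made up for this example. The sketch asks the operating system what kind of file sits at a given path, then writes to /dev/null exactly as it would write to an ordinary file.

```python
import os
import stat


def describe(path: str) -> str:
    """Return a rough label for the kind of file at path (illustrative helper)."""
    mode = os.stat(path).st_mode
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"


# /dev/null is a device, yet it has a file path like any other file.
for path in ("/dev/null", "/"):
    print(path, "->", describe(path))

# Writing to the device uses the same calls as writing to a regular file;
# /dev/null simply discards whatever is written to it.
with open("/dev/null", "w") as device:
    written = device.write("discarded")
```

On a typical Linux machine, disks show up the same way under /dev as block devices, although the exact names vary from system to system.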