01. BASH Unix Shell

Generalised Linear Models

©2018 Raazesh Sainudiin. Attribution 4.0 International (CC BY 4.0)

  1. Dropping into BASH (Unix Shell) and using basic Shell commands
    • pwd --- print working directory
    • ls --- list files in current working directory
    • mkdir --- make directory
    • cd --- change directory
    • man ls --- show the manual page for ls (man works the same way for other commands)
  2. Grabbing files from the internet using curl
In [36]:
def showURL(url, ht=500):
    """Return an IFrame of the url to show in notebook with height ht"""
    from IPython.display import IFrame
    return IFrame(url, width='95%', height=ht) 
showURL('https://en.wikipedia.org/wiki/Bash_(Unix_shell)',400)
Out[36]:

1. Dropping into BASH (Unix Shell)

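Using the %%sh cell magic on the first line of a code cell, we can run the rest of the cell as shell commands. Strictly speaking, %%sh invokes sh, which is not BASH on every system; the %%bash magic asks for BASH explicitly.

Let us run pwd to print the working directory.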

In [37]:
%%sh
pwd
/home/raazesh/all/git/scalable-data-science/_glm/2018/jp
In [41]:
%%sh
# this is a comment in the BASH shell, as it is preceded by '#'
ls  # list the contents of this working directory
00.html
00.ipynb
00.md
01.html
01.ipynb
01.md
02.html
02.ipynb
02.md
imagesFromR
mydir
In [43]:
%%sh
mkdir mydir
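
Plain mkdir reports an error when the directory already exists, which matters when a cell is re-run. The following is a minimal sketch of how the -p option makes the command safe to repeat. It works in a scratch directory created with mktemp; the names scratch and demo are hypothetical and not part of this notebook.

```shell
scratch=$(mktemp -d)            # hypothetical scratch area, not the notebook's mydir
mkdir "$scratch/demo"           # first creation succeeds
if mkdir "$scratch/demo" 2>/dev/null; then
    plain_status="ok"
else
    plain_status="failed"       # plain mkdir refuses to recreate an existing directory
fi
mkdir -p "$scratch/demo"        # -p: no error if the directory already exists
p_status=$?
echo "plain mkdir: $plain_status, mkdir -p exit status: $p_status"
rm -r "$scratch"                # clean up the scratch area
```
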
In [48]:
%%sh
cd mydir
pwd
ls -al
/home/raazesh/all/git/scalable-data-science/_glm/2018/jp/mydir
total 3620
drwxr-xr-x 3 raazesh raazesh    4096 Nov  6 14:09 .
drwxr-xr-x 5 raazesh raazesh    4096 Nov  6 15:59 ..
-rw-r--r-- 1 raazesh raazesh   29323 Nov  6 09:28 20170228.txt
drwx------ 2 raazesh raazesh   12288 Feb 18  2016 sou
-rw-r--r-- 1 raazesh raazesh 3652403 Nov  6 14:09 sou.tar.gz
In [45]:
%%sh
pwd
/home/raazesh/all/git/scalable-data-science/_glm/2018/jp
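
The pwd output above is the original directory again, even though the preceding cell ran cd mydir. This is consistent with each %%sh cell running in its own shell process, so that a cd lasts only until the end of the cell that contains it. The same behaviour can be sketched in a single script, where the cd happens in a child shell; the scratch directory is hypothetical.

```shell
start_dir=$(pwd)
scratch=$(mktemp -d)                    # hypothetical scratch directory
inside_dir=$( cd "$scratch" && pwd )    # the cd runs in a child shell only
after_dir=$(pwd)                        # the parent shell never moved
echo "inside: $inside_dir"
echo "after:  $after_dir"
rmdir "$scratch"                        # clean up
```

This is why the later cells repeat cd mydir at the top whenever they need to work inside that directory.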
In [46]:
%%sh
man ls
LS(1)                            User Commands                           LS(1)

NAME
       ls - list directory contents

SYNOPSIS
       ls [OPTION]... [FILE]...

DESCRIPTION
       List  information  about  the FILEs (the current directory by default).
       Sort entries alphabetically if none of -cftuvSUX nor --sort  is  speci‐
       fied.

       Mandatory  arguments  to  long  options are mandatory for short options
       too.

       -a, --all
              do not ignore entries starting with .

       -A, --almost-all
              do not list implied . and ..

       --author
              with -l, print the author of each file

       -b, --escape
              print C-style escapes for nongraphic characters

       --block-size=SIZE
              scale sizes by SIZE before printing them; e.g., '--block-size=M'
              prints sizes in units of 1,048,576 bytes; see SIZE format below

       -B, --ignore-backups
              do not list implied entries ending with ~

       -c     with -lt: sort by, and show, ctime (time of last modification of
              file status information); with -l: show ctime and sort by  name;
              otherwise: sort by ctime, newest first

       -C     list entries by columns

       --color[=WHEN]
              colorize  the output; WHEN can be 'always' (default if omitted),
              'auto', or 'never'; more info below

       -d, --directory
              list directories themselves, not their contents

       -D, --dired
              generate output designed for Emacs' dired mode

       -f     do not sort, enable -aU, disable -ls --color

       -F, --classify
              append indicator (one of */=>@|) to entries

       --file-type
              likewise, except do not append '*'

       --format=WORD
              across -x, commas -m, horizontal -x, long -l, single-column  -1,
              verbose -l, vertical -C

       --full-time
              like -l --time-style=full-iso

       -g     like -l, but do not list owner

       --group-directories-first
              group directories before files;

              can   be  augmented  with  a  --sort  option,  but  any  use  of
              --sort=none (-U) disables grouping

       -G, --no-group
              in a long listing, don't print group names

       -h, --human-readable
              with -l and/or -s, print human readable sizes (e.g., 1K 234M 2G)

       --si   likewise, but use powers of 1000 not 1024

       -H, --dereference-command-line
              follow symbolic links listed on the command line

       --dereference-command-line-symlink-to-dir
              follow each command line symbolic link

              that points to a directory

       --hide=PATTERN
              do not list implied entries matching shell  PATTERN  (overridden
              by -a or -A)

       --hyperlink[=WHEN]
              hyperlink file names; WHEN can be 'always' (default if omitted),
              'auto', or 'never'

       --indicator-style=WORD
              append indicator with style WORD to entry names: none (default),
              slash (-p), file-type (--file-type), classify (-F)

       -i, --inode
              print the index number of each file

       -I, --ignore=PATTERN
              do not list implied entries matching shell PATTERN

       -k, --kibibytes
              default to 1024-byte blocks for disk usage

       -l     use a long listing format

       -L, --dereference
              when showing file information for a symbolic link, show informa‐
              tion for the file the link references rather than for  the  link
              itself

       -m     fill width with a comma separated list of entries

       -n, --numeric-uid-gid
              like -l, but list numeric user and group IDs

       -N, --literal
              print entry names without quoting

       -o     like -l, but do not list group information

       -p, --indicator-style=slash
              append / indicator to directories

       -q, --hide-control-chars
              print ? instead of nongraphic characters

       --show-control-chars
              show nongraphic characters as-is (the default, unless program is
              'ls' and output is a terminal)

       -Q, --quote-name
              enclose entry names in double quotes

       --quoting-style=WORD
              use quoting style WORD for entry names: literal, locale,  shell,
              shell-always, shell-escape, shell-escape-always, c, escape

       -r, --reverse
              reverse order while sorting

       -R, --recursive
              list subdirectories recursively

       -s, --size
              print the allocated size of each file, in blocks

       -S     sort by file size, largest first

       --sort=WORD
              sort  by  WORD instead of name: none (-U), size (-S), time (-t),
              version (-v), extension (-X)

       --time=WORD
              with -l, show time as WORD instead of default modification time:
              atime  or  access  or  use  (-u); ctime or status (-c); also use
              specified time as sort key if --sort=time (newest first)

       --time-style=STYLE
              with -l, show times using style STYLE: full-iso, long-iso,  iso,
              locale,  or  +FORMAT;  FORMAT  is interpreted like in 'date'; if
              FORMAT  is  FORMAT1<newline>FORMAT2,  then  FORMAT1  applies  to
              non-recent  files  and FORMAT2 to recent files; if STYLE is pre‐
              fixed with 'posix-', STYLE takes effect only outside  the  POSIX
              locale

       -t     sort by modification time, newest first

       -T, --tabsize=COLS
              assume tab stops at each COLS instead of 8

       -u     with  -lt:  sort by, and show, access time; with -l: show access
              time and sort by name; otherwise: sort by  access  time,  newest
              first

       -U     do not sort; list entries in directory order

       -v     natural sort of (version) numbers within text

       -w, --width=COLS
              set output width to COLS.  0 means no limit

       -x     list entries by lines instead of by columns

       -X     sort alphabetically by entry extension

       -Z, --context
              print any security context of each file

       -1     list one file per line.  Avoid '\n' with -q or -b

       --help display this help and exit

       --version
              output version information and exit

       The  SIZE  argument  is  an  integer and optional unit (example: 10K is
       10*1024).  Units are K,M,G,T,P,E,Z,Y  (powers  of  1024)  or  KB,MB,...
       (powers of 1000).

       Using  color  to distinguish file types is disabled both by default and
       with --color=never.  With --color=auto, ls emits color codes only  when
       standard  output is connected to a terminal.  The LS_COLORS environment
       variable can change the settings.  Use the dircolors command to set it.

   Exit status:
       0      if OK,

       1      if minor problems (e.g., cannot access subdirectory),

       2      if serious trouble (e.g., cannot access command-line argument).

AUTHOR
       Written by Richard M. Stallman and David MacKenzie.

REPORTING BUGS
       GNU coreutils online help: <http://www.gnu.org/software/coreutils/>
       Report ls translation bugs to <http://translationproject.org/team/>

COPYRIGHT
       Copyright © 2017 Free Software Foundation, Inc.   License  GPLv3+:  GNU
       GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
       This  is  free  software:  you  are free to change and redistribute it.
       There is NO WARRANTY, to the extent permitted by law.

SEE ALSO
       Full documentation at: <http://www.gnu.org/software/coreutils/ls>
       or available locally via: info '(coreutils) ls invocation'

GNU coreutils 8.28               January 2018                            LS(1)
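
A few of the options listed in the manual page can be tried on a throwaway directory. The sketch below assumes a scratch directory and two made-up file names (visible.txt and .hidden), and uses only -A, -l and -h as described above.

```shell
demo_dir=$(mktemp -d)                      # hypothetical scratch directory
touch "$demo_dir/visible.txt" "$demo_dir/.hidden"
plain=$(ls "$demo_dir")                    # names starting with . are skipped by default
almost_all=$(ls -A "$demo_dir")            # -A: include dotfiles, but not . and ..
echo "ls:"
echo "$plain"
echo "ls -A:"
echo "$almost_all"
ls -lh "$demo_dir"                         # -l: long format, -h: human-readable sizes
rm -r "$demo_dir"                          # clean up
```
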

2. Grabbing files from the internet using curl

In [8]:
%%sh
cd mydir
curl -O http://lamastex.org/datasets/public/SOU/sou/20170228.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 29323  100 29323    0     0  29323      0  0:00:01 --:--:--  0:00:01 37836
In [9]:
%%sh
ls mydir/
20170228.txt
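
The curl call above can be made a little more defensive. The following is a sketch under stated assumptions, not the method used in this notebook: it fetches from a local file:// URL so that it runs without network access, and the directory and file names are made up. The options used are -f (fail on a server error instead of saving the error page), -sS (quiet, but still report errors) and -O (save under the remote file name).

```shell
src_dir=$(mktemp -d)                        # stands in for the remote server (assumption)
dest_dir=$(mktemp -d)                       # stands in for mydir (assumption)
printf 'line one\nline two\n' > "$src_dir/sample.txt"
url="file://$src_dir/sample.txt"            # replace with an http(s) URL in real use
# Run in a subshell so the cd does not leak into the rest of the script
( cd "$dest_dir" && curl -fsS -O "$url" )
fetch_status=$?                             # 0 means curl reported success
fetched=$(cat "$dest_dir/sample.txt")       # contents of the downloaded copy
echo "curl exit status: $fetch_status"
rm -r "$src_dir" "$dest_dir"                # clean up
```

Checking the exit status before using the file avoids carrying on with a partial or missing download.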
In [49]:
%%sh
cd mydir/
head 20170228.txt
Donald J. Trump 

February 28, 2017 
Thank you very much. Mr. Speaker, Mr. Vice President, members of Congress, the first lady of the United States ... 
... and citizens of America, tonight, as we mark the conclusion of our celebration of Black History Month, we are reminded of our nation's path toward civil rights and the work that still remains to be done. 
Recent threats ... 
Recent threats targeting Jewish community centers and vandalism of Jewish cemeteries, as well as last week's shooting in Kansas City, remind us that while we may be a nation divided on policies, we are a country that stands united in condemning hate and evil in all of its very ugly forms. 
Each American generation passes the torch of truth, liberty and justice, in an unbroken chain all the way down to the present. That torch is now in our hands. And we will use it to light up the world. 
I am here tonight to deliver a message of unity and strength, and it is a message deeply delivered from my heart. A new chapter ... 
... of American greatness is now beginning. A new national pride is sweeping across our nation. And a new surge of optimism is placing impossible dreams firmly within our grasp. What we are witnessing today is the renewal of the American spirit. Our allies will find that America is once again ready to lead. 
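
head prints the first 10 lines of a file by default. A short sketch of related ways to inspect a text file without opening it follows; it uses a made-up file in a scratch directory rather than the downloaded address.

```shell
scratch=$(mktemp -d)                                      # hypothetical scratch directory
printf 'a\nb\nc\nd\ne\n' > "$scratch/letters.txt"         # made-up five-line file
first_two=$(head -n 2 "$scratch/letters.txt")             # first 2 lines
last_one=$(tail -n 1 "$scratch/letters.txt")              # last line
line_count=$(wc -l < "$scratch/letters.txt" | tr -d ' ')  # number of lines
echo "$first_two"
echo "$last_one"
echo "$line_count"
rm -r "$scratch"                                          # clean up
```
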

To have more fun with all State of the Union (SOU) addresses

Download the full archive of addresses into mydir as follows:

In [32]:
%%sh
mkdir -p mydir # create a directory called 'mydir' if it does not already exist
cd mydir # change into this mydir directory
rm -f sou.tar.gz # remove any file in mydir called sou.tar.gz
rm -f sou.tgz # remove any file in mydir called sou.tgz
curl -O http://lamastex.org/datasets/public/SOU/sou.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 3566k  100 3566k    0     0   445k      0  0:00:08  0:00:08 --:--:--  502k
In [33]:
%%sh
pwd
ls -lh mydir
/home/raazesh/all/git/scalable-data-science/_glm/2018/jp
total 3.6M
-rw-r--r-- 1 raazesh raazesh  29K Nov  6 09:28 20170228.txt
-rw-r--r-- 1 raazesh raazesh 3.5M Nov  6 14:09 sou.tar.gz
In [35]:
%%sh
cd mydir 
tar zxvf sou.tar.gz
sou/
sou/18111105.txt
sou/20040120.txt
sou/19061203.txt
sou/18411207.txt
sou/19091207.txt
sou/18701205.txt
sou/19410106.txt
sou/18571208.txt
sou/18891203.txt
sou/18341201.txt
sou/19660112.txt
sou/17981208.txt
sou/19610130.txt
sou/18140920.txt
sou/18011208.txt
sou/18811206.txt
sou/18281202.txt
sou/19840125.txt
sou/18611203.txt
sou/18731201.txt
sou/19400103.txt
sou/19630114.txt
sou/19281204.txt
sou/19221208.txt
sou/19031207.txt
sou/18681209.txt
sou/18431206.txt
sou/18861206.txt
sou/19261207.txt
sou/19271206.txt
sou/19141208.txt
sou/18791201.txt
sou/19131202.txt
sou/19041206.txt
sou/18001111.txt
sou/18041108.txt
sou/20010227.txt
sou/18621201.txt
sou/19251208.txt
sou/19700122.txt
sou/19790125.txt
sou/19870127.txt
sou/20050202.txt
sou/18331203.txt
sou/17961207.txt
sou/18021215.txt
sou/18771203.txt
sou/19890209.txt
sou/18301206.txt
sou/18121104.txt
sou/19580109.txt
sou/20110125.txt
sou/19450106.txt
sou/18031017.txt
sou/19301202.txt
sou/18661203.txt
sou/19520109.txt
sou/19620111.txt
sou/18531205.txt
sou/19610112.txt
sou/19430107.txt
sou/19960123.txt
sou/17911025.txt
sou/18211203.txt
sou/18951207.txt
sou/18901201.txt
sou/18721202.txt
sou/20140128.txt
sou/18361205.txt
sou/18101205.txt
sou/18081108.txt
sou/18961204.txt
sou/18871206.txt
sou/18781202.txt
sou/19480107.txt
sou/19001203.txt
sou/18421206.txt
sou/18241207.txt
sou/18131207.txt
sou/19500104.txt
sou/20010920.txt
sou/19940125.txt
sou/19850206.txt
sou/18541204.txt
sou/17921106.txt
sou/19800121.txt
sou/19311208.txt
sou/18461208.txt
sou/19161205.txt
sou/19121203.txt
sou/19370106.txt
sou/19151207.txt
sou/19051205.txt
sou/19021202.txt
sou/18321204.txt
sou/18671203.txt
sou/18651204.txt
sou/19510108.txt
sou/18581206.txt
sou/18161203.txt
sou/19390104.txt
sou/19321206.txt
sou/18641206.txt
sou/20070123.txt
sou/18691206.txt
sou/17991203.txt
sou/18551231.txt
sou/19440111.txt
sou/19910129.txt
sou/18921206.txt
sou/18061202.txt
sou/19470106.txt
sou/19590109.txt
sou/18151205.txt
sou/18751207.txt
sou/18981205.txt
sou/20090224.txt
sou/18071027.txt
sou/18171212.txt
sou/18821204.txt
sou/19211206.txt
sou/18371205.txt
sou/19181202.txt
sou/19720120.txt
sou/18601203.txt
sou/19530107.txt
sou/18741207.txt
sou/19460121.txt
sou/19350104.txt
sou/19201207.txt
sou/18591219.txt
sou/18221203.txt
sou/19231206.txt
sou/19730202.txt
sou/19291203.txt
sou/19820126.txt
sou/20060131.txt
sou/18501202.txt
sou/17931203.txt
sou/18711204.txt
sou/18441203.txt
sou/17900108.txt
sou/18561202.txt
sou/18381203.txt
sou/19071203.txt
sou/19570110.txt
sou/19171204.txt
sou/18181116.txt
sou/19420106.txt
sou/18201114.txt
sou/20130212.txt
sou/20030128.txt
sou/19340103.txt
sou/19970204.txt
sou/19810116.txt
sou/18841201.txt
sou/19101206.txt
sou/18351207.txt
sou/17951208.txt
sou/19530202.txt
sou/18971206.txt
sou/18231202.txt
sou/18831204.txt
sou/19490105.txt
sou/18991205.txt
sou/18391202.txt
sou/18481205.txt
sou/18631208.txt
sou/19930217.txt
sou/19111205.txt
sou/19740130.txt
sou/20160112.txt
sou/19880125.txt
sou/19690114.txt
sou/19360103.txt
sou/20120124.txt
sou/18091129.txt
sou/19680117.txt
sou/18851208.txt
sou/20100127.txt
sou/19191202.txt
sou/19670110.txt
sou/18451202.txt
sou/19241203.txt
sou/20080128.txt
sou/18291208.txt
sou/19980127.txt
sou/18401205.txt
sou/18471207.txt
sou/19540107.txt
sou/18191207.txt
sou/18801206.txt
sou/18491204.txt
sou/18881203.txt
sou/19950124.txt
sou/19380103.txt
sou/18761205.txt
sou/18931203.txt
sou/18051203.txt
sou/18271204.txt
sou/19900131.txt
sou/18941202.txt
sou/20150120.txt
sou/17941119.txt
sou/18261205.txt
sou/19710122.txt
sou/17901208.txt
sou/19081208.txt
sou/19550106.txt
sou/17971122.txt
sou/19560105.txt
sou/19640108.txt
sou/20020129.txt
sou/19920128.txt
sou/18511202.txt
sou/18311206.txt
sou/19770112.txt
sou/18521206.txt
sou/19760119.txt
sou/18251206.txt
sou/19860204.txt
sou/19830125.txt
sou/19780119.txt
sou/19650104.txt
sou/19990119.txt
sou/20000127.txt
sou/19600107.txt
sou/19011203.txt
sou/19750115.txt
sou/18911209.txt
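
The flags in tar zxvf are z (gzip), x (extract), v (verbose, print each file name) and f (read from the named archive file). Before extracting an archive obtained from elsewhere, its contents can be listed by using t in place of x. A sketch on a small made-up archive, with all paths hypothetical:

```shell
work=$(mktemp -d)                               # hypothetical scratch directory
mkdir "$work/src" "$work/out"
printf 'hello\n' > "$work/src/a.txt"
printf 'world\n' > "$work/src/b.txt"
tar -czf "$work/demo.tar.gz" -C "$work" src     # c: create; -C: change directory first
listing=$(tar -tzf "$work/demo.tar.gz")         # t: list contents without extracting
echo "$listing"
tar -xzf "$work/demo.tar.gz" -C "$work/out"     # extract into a separate directory
extracted=$(cat "$work/out/src/a.txt")
rm -r "$work"                                   # clean up
```

A listing step also fails on a truncated or corrupt archive, so it doubles as a quick check that the download completed.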
In [30]:
%%sh
ls -lh mydir/
total 40K
-rw-r--r-- 1 raazesh raazesh 29K Nov  6 09:28 20170228.txt
-rw-r--r-- 1 raazesh raazesh 351 Nov  6 13:51 sou.tar.gz
-rw-r--r-- 1 raazesh raazesh 348 Nov  6 13:54 sou.tgz