Learning COBOL with Examples - Part 1: Basics and Hello World
Learning a new programming language, especially a really old and obscure one, is hard. This is especially true of COBOL, since most documentation is old and outdated and the online community is tiny. Searching for your problem on an arbitrary search engine will therefore probably not solve your issue the way it would for a C or C++ problem. However, learning a new language can also be fun and educational, and there are still many old programs out there written in COBOL.
This blog entry and the upcoming ones are here to help you get off the ground. I'm not saying that COBOL is my favourite language, and I'm not saying that my code is good. But I still hope these posts will help you understand some basics of the COBOL language. I'm gonna spare you the historic parts.
I'm a C programmer. My background is in Operating Systems and bare-metal programming.
The thumbnail-image is from [3].
Compiler
You will need a compiler, which translates the COBOL source code into natively executable programs. I'm using GnuCOBOL [1]. Their website also has handy reference sheets if you ever need to look up the grammar of COBOL. The forum on SourceForge is also one of the few places where you find people asking about COBOL-related problems.
On Arch Linux, gnu-cobol is not in the official repositories, but it can be found in the AUR [2]. Your distribution will probably also have a package ready for gnu-cobol.
I'm going to use a few flags as default, unless stated otherwise.
cobc -x --free --Wall inputfile.cob
The ''-x'' instructs the compiler to create an executable file. ''--free'' makes it use the free source format of COBOL. Older versions of COBOL used a fixed format that goes back to punch cards: columns 1 to 6 hold a sequence (line) number, column 7 is an indicator field (an asterisk there marks a comment line), the code itself goes into columns 8 to 72, and columns 73 to 80 are reserved for an identification field. To make COBOL look more like a modern-ish programming language I'm going to use the free format, so I'm able to write very long lines and start in the very first column. ''--Wall'' turns on the compiler's warning messages, which will help us improve our code.
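For comparison, here is a rough sketch of what the old fixed format looks like. It assumes the usual punch-card layout (a six-digit sequence number, then an indicator column, then the code) and that the compiler is run without ''--free''. The program name ''fixeddemo'' is made up for this illustration.

```cobol
000100 IDENTIFICATION DIVISION.
000200 PROGRAM-ID. fixeddemo.
000300* An asterisk in column 7 marks a comment line.
000400 PROCEDURE DIVISION.
000500     DISPLAY "Hello World".
000600     STOP RUN.
```

Every other example in this post uses the free format, so none of this column counting is needed there.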
A first very very simple program
This is a simple program, displaying "Hello World" and the truth about everything. It does a bit more than a normal "Hello World" program, so that it can show you more of the Divisions of a COBOL program.
IDENTIFICATION DIVISION.
PROGRAM-ID. truth.
ENVIRONMENT DIVISION.
CONFIGURATION SECTION.
SOURCE-COMPUTER. IBM-PC.
OBJECT-COMPUTER. IBM-PC.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 truth PIC IS 99.
PROCEDURE DIVISION.
begin.
DISPLAY "Hello World".
MOVE 42 TO truth.
DISPLAY truth.
STOP RUN.
The first thing that will probably catch your eye is that the code is divided into so-called Divisions. Divisions contain Sections, which in turn contain Paragraphs, then Sentences, Verbs and Characters. COBOL was designed to be quite close to the English language, so you will probably understand most of the operations already.
You see four divisions in the code above:
- IDENTIFICATION DIVISION - for identifying the program. There is not much of use in there today. The PROGRAM-ID has to be set to be usable from other programs (a topic for another time).
- ENVIRONMENT DIVISION - for identifying the environment. The CONFIGURATION SECTION is there for defining source and target computers. This is not really relevant anymore and can easily be left out of your programs. However, the ENVIRONMENT DIVISION will be used later for File I/O.
- DATA DIVISION - your variables and data structures will be defined here (inside the WORKING-STORAGE SECTION).
- PROCEDURE DIVISION - this is where the actual code is.
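To illustrate that not every division is needed, here is a minimal sketch that drops the ENVIRONMENT DIVISION and the DATA DIVISION entirely. The program name ''minimal'' is just a placeholder, and the sketch assumes compilation with the flags shown above.

```cobol
*> Minimal sketch: only the two divisions this tiny program needs.
IDENTIFICATION DIVISION.
PROGRAM-ID. minimal.
PROCEDURE DIVISION.
    DISPLAY "Hello World".
    STOP RUN.
```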
Defining some Variables (Data Division)
In the program above, the DATA DIVISION and the PROCEDURE DIVISION are the interesting parts. Let's start with the DATA DIVISION, as I defined a variable named ''truth'' in there. When defining data structures, the syntax is as follows:
<level> <name> PIC [IS] <type> [VALUE [IS] <value>].
There are more clauses that can be added, but they are omitted here. COBOL works a bit differently from other languages when describing data structures. Nearly everything in COBOL is a record (or struct). The level number indicates which level of the hierarchy this entry is on. Valid level numbers are 01 to 49, 66, 77 and 88.
Level | Meaning |
---|---|
01 | record level, top level |
02-49 | group and elementary items below the record level |
66 | RENAMES clause |
77 | standalone items, which cannot be sub-divided |
88 | condition name entry |
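As a small sketch of the two special levels 77 and 88, the following program defines a standalone item and attaches two condition names to a one-character field. All names (''leveldemo'', ''item-count'', ''user-choice'', ''choice-yes'', ''choice-no'') are invented for this example, and the ''IF'' statement is only there to show how a condition name reads in use.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. leveldemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Level 77: a standalone item that is not part of any record.
77 item-count PIC 99 VALUE 0.
*> Level 01 item with two level 88 condition names attached to it.
01 user-choice PIC X VALUE "Y".
    88 choice-yes VALUE "Y".
    88 choice-no  VALUE "N".
PROCEDURE DIVISION.
    *> choice-yes is true while user-choice holds "Y".
    IF choice-yes
        DISPLAY "the choice is yes"
    END-IF.
    DISPLAY item-count.
    STOP RUN.
```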
Let's focus on the levels 01 to 49. With the level number one can create nested data structures. An entry with a smaller level number is the name of the group formed by the entries with larger level numbers that follow it. For example, one can build a structure as follows:
DATA DIVISION.
WORKING-STORAGE SECTION.
01 person.
02 name PIC IS X(20).
02 tel PIC IS 9(10).
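The same idea can be nested further. The sketch below uses made-up names (''contact'', ''full-name'', ''phone'', ''area-code'', ''local-no'') and shows that an elementary item is addressed by its own name, while a group name refers to all of its fields at once.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. groupdemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 contact.
    02 full-name PIC X(20).
    02 phone.
        03 area-code PIC 9(4).
        03 local-no  PIC 9(6).
PROCEDURE DIVISION.
    MOVE "Ada" TO full-name.
    MOVE 1234 TO area-code.
    MOVE 567890 TO local-no.
    *> An elementary item is addressed by its own name ...
    DISPLAY full-name.
    *> ... while the group name covers both of its fields.
    DISPLAY phone.
    STOP RUN.
```

Displaying ''phone'' shows the contents of ''area-code'' and ''local-no'' one after the other, because a group is simply the storage of its members laid out in order.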
The name is any arbitrary name you want to give to the variable. The PIC clause identifies the data type of the variable. A variable in COBOL is always composed of individual character positions. All positions can be of the same type, but you can also combine different types in different positions, so you can have numbers with a sign or decide where the decimal point is. But you have to do that statically.
So let's say you want to define a variable which can store a four-digit year; then the following would be a valid definition. Both statements are equivalent.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 year PIC IS 9999.
01 year2 PIC IS 9(4).
Now if you know other programming languages already, the notation is a bit weird. There is no data type, is there? Oh, yes, there is. But it is written down differently than in C. As said, the PIC-clause determines the type of the variable. The following types are available:
Symbol | Meaning |
---|---|
9 | Numeric |
A | Alphabetic |
P | Assumed decimal scaling position |
S | Sign |
V | Decimal point (implicit) |
X | Alphanumeric |
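Several of these symbols can be combined in one PIC clause. As a sketch, the hypothetical item ''balance'' below holds a signed number with three digits before and two digits after an implied decimal point. How exactly ''DISPLAY'' formats the sign and the decimal point may differ between compilers, so no output is shown here.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. picdemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> S = sign, 9(3) = three digits, V = implied decimal point, 99 = two decimals.
01 balance PIC S9(3)V99.
PROCEDURE DIVISION.
    MOVE -12.5 TO balance.
    DISPLAY balance.
    STOP RUN.
```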
Besides PIC, further clauses can follow, among them the VALUE clause. It sets the initial value the variable will have when the program starts.
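A short sketch of the VALUE clause, reusing ''truth'' from the first program and adding a made-up ''greeting'' item. Because both items are preset, no ''MOVE'' is needed before displaying them.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. valuedemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
*> Both items already hold their value when the program starts.
01 truth    PIC 99    VALUE 42.
01 greeting PIC X(11) VALUE "Hello World".
PROCEDURE DIVISION.
    DISPLAY greeting.
    DISPLAY truth.
    STOP RUN.
```

This prints ''Hello World'' followed by ''42'', just like the first program.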
Writing a Program (Procedure Division)
Now let's take a gander at the Procedure Division of the above example.
PROCEDURE DIVISION.
begin.
DISPLAY "Hello World".
MOVE 42 TO truth.
DISPLAY truth.
STOP RUN.
It consists of two notable things. The first is the label ''begin'' (COBOL calls this a paragraph name). This is a label like one might know from assembly code or really bad C code. It allows your code to jump to that exact place with the help of the ''GO TO'' verb. But more on that later.
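Just to give a first impression of such a jump, here is a small sketch. The paragraph names ''skipped'' and ''finish'' are invented for this example; only ''begin'' comes from the program above.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. jumpdemo.
PROCEDURE DIVISION.
begin.
    DISPLAY "before the jump".
    *> Continue execution at the paragraph named finish.
    GO TO finish.
skipped.
    *> This paragraph is jumped over and never runs.
    DISPLAY "never shown".
finish.
    DISPLAY "after the jump".
    STOP RUN.
```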
Second, the statements all basically look the same: there is a verb followed by one or more operands. ''DISPLAY'' prints the string or variable after it to stdout. ''MOVE'' is a copying operator, like the ''mov'' instruction in most ISAs. The direction of the move is written clearly in the rest of the sentence: ''42 TO truth''. So you probably already knew what this sentence did before reading this far. ''STOP RUN'' stops the program, much like ''exit()'' in C.
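One detail worth seeing in action: ''MOVE'' copies a value and fits it to the PIC of the target. In the sketch below (''truth-backup'' is a made-up name), the single digit 7 is moved into a two-digit field and then copied on to a second one.

```cobol
IDENTIFICATION DIVISION.
PROGRAM-ID. movedemo.
DATA DIVISION.
WORKING-STORAGE SECTION.
01 truth        PIC 99.
01 truth-backup PIC 99.
PROCEDURE DIVISION.
    *> A literal can be moved into a variable ...
    MOVE 7 TO truth.
    *> ... and one variable can be moved into another.
    MOVE truth TO truth-backup.
    DISPLAY truth.
    DISPLAY truth-backup.
    STOP RUN.
```

Both ''DISPLAY'' statements print ''07'', since a ''PIC 99'' item always holds exactly two digits.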
There are loads of verbs and built-in functions in modern COBOL, so listing them all here is not only redundant but also nearly impossible.
Comments
Comments are really important. While they were omitted in the above example, one can use the following syntax to comment lines.
*> This is a comment, which will be ignored by the compiler
DISPLAY "HELLO WORLD" *> DISPLAY is compiled, but this comment is not
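Put into a complete program, both kinds of comment could look like the following sketch (the program name ''commentdemo'' is a placeholder).

```cobol
*> Full-line comment before the program even starts.
IDENTIFICATION DIVISION.
PROGRAM-ID. commentdemo.
PROCEDURE DIVISION.
    *> Full-line comment between two statements.
    DISPLAY "HELLO WORLD". *> inline comment after a statement
    STOP RUN.
```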
Conclusion
So there we have it. This post showed a very simple COBOL program and tried to describe the concepts generically. If you find errors, don't hesitate to contact me (English or German, both fine).
References
[1] https://open-cobol.sourceforge.io/
[2] https://aur.archlinux.org/packages/gnu-cobol/
[3] http://deacademic.com/dic.nsf/dewiki/873823