Steganography, Hide Text in Image

Last updated on 15 November 2020 at 6:00pm


In a cryptography and network security class that I’m taking this semester, I came across steganography. In this article I’ll take you through how I implemented it in C using the least significant bits technique.

Intro

So what is steganography? Steganography is the art of hiding information within something so that the message appears to be something else, or part of something else. Morse code is an example of physical steganography, where text is encoded using dots and dashes. The figure below shows a bunch of dots and dashes. When decoded, the text Hello world is retrieved.

The message is hidden in plain sight.

In digital steganography we can hide text inside an image in a way that doesn’t distort the image too much. The distortion is barely visible to the eye. Try to spot the difference between the two images shown below.

1 2

Image 2 has the English version of the Kenyan national anthem hidden inside it. The difference is barely visible.

How does an image store its data?

To implement this you’ll need to understand how image data is stored. An image is composed of pixels. Each pixel has red, green, blue and alpha (transparency) values that determine the color of the pixel. When the pixels are combined they form an image.

The red, green, blue and alpha values each range from 0 to 255. Using RGBA notation, a red pixel will be represented as (255, 0, 0, 255) and a blue pixel as (0, 0, 255, 255).

Say we have two pixels, yellow (255, 255, 0, 255) and pink (255, 192, 203, 255). If we were to modify the last two bits of the binary representation of each value, we would get a slightly different shade of yellow and pink. The new shades could be, for example, yellow (252, 252, 3, 252) and pink (252, 195, 200, 252).

The table below shows it visually, yellow, modified yellow, pink, modified pink respectively.

Pixel Colors
Yellow                 Modified yellow
R - 11111111 (255)     R - 11111100 (252)
G - 11111111 (255)     G - 11111100 (252)
B - 00000000 (0)       B - 00000011 (3)
A - 11111111 (255)     A - 11111100 (252)
Pink                   Modified pink
R - 11111111 (255)     R - 11111100 (252)
G - 11000000 (192)     G - 11000011 (195)
B - 11001011 (203)     B - 11001000 (200)
A - 11111111 (255)     A - 11111100 (252)

From this information we can see that altering the 2 least significant bits of each value changes the colors, but each value moves by at most 3, so the change is hardly visible to the human eye.
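The effect on a single channel can be sketched in C. The helper name `replace_low2` is my own for illustration and is not taken from the program's source. It clears the two lowest bits of a channel and writes a 2-bit value in their place.

```c
#include <assert.h>

/* Hypothetical helper (not from the original source): overwrite the two
   least significant bits of a colour channel with `bits` (0..3). */
unsigned char replace_low2(unsigned char channel, unsigned char bits)
{
    assert(bits <= 3);
    /* 0xFC is 11111100: keep the top six bits, then add the new pair. */
    return (unsigned char)((channel & 0xFC) | bits);
}
```

With the pink pixel from the table, `replace_low2(192, 3)` gives 195 and `replace_low2(203, 0)` gives 200.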

This is the technique that we will use to hide our data. Each pixel has 4 values and we use 2 bits of each, so for each pixel we can store 8 bits of information, which is equivalent to 1 byte. Therefore we can encode any single-byte value (0 - 255), which covers every character of the ASCII table, in one pixel.

Example

Let’s encode the text Hi! in the 2x2 image below. The annotations are not part of the image (a pixel can only be one color).

The first step is to get the binary representation of the text:

Character  ASCII code (decimal)  Binary
H          72                    01001000
i          105                   01101001
!          33                    00100001

The RGBA values of the first pixel (pixel 0) are (0, 126, 160, 255). We’ll encode H into this pixel in the order A, R, G, B, as shown in the illustration below.

i and ! will be encoded into pixels 1 and 2 respectively. For pixel 3 we can encode a character that tells the decoder that the message has ended. For my program I went with the null terminator \0, which is just a 0 (00000000).
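To make the A, R, G, B ordering concrete, here is a small sketch that splits a character into four 2-bit pairs, most significant pair first. The function name `split_pairs` is an assumption made for this example and does not appear in the program.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative helper (hypothetical name): split a byte into four 2-bit
   values. pairs[0] is meant for alpha, pairs[1] for red, pairs[2] for
   green and pairs[3] for blue. */
void split_pairs(unsigned char c, unsigned char pairs[4])
{
    assert(pairs != NULL);
    pairs[0] = (c >> 6) & 3; /* bits 7-6 -> alpha */
    pairs[1] = (c >> 4) & 3; /* bits 5-4 -> red   */
    pairs[2] = (c >> 2) & 3; /* bits 3-2 -> green */
    pairs[3] = c & 3;        /* bits 1-0 -> blue  */
}
```

For H (01001000) the pairs are 01, 00, 10 and 00. Writing them into the low bits of the first pixel (0, 126, 160, 255) only changes alpha, from 255 to 253, because the low bits of the red, green and blue values already happen to match.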

Implementation

You can find the source code here.

The program has three main sections:

  1. read_bitmap function
  2. enc_char function
  3. dec_char function

And two main structs: the pixel struct and the bitmap struct.

typedef struct pixel
{
	unsigned char b, g, r, a;
} pixel_t;

typedef struct bitmap
{
	unsigned int offset;
	unsigned int depth;
	unsigned int file_size;
	unsigned int width, height;
	pixel_t * pixels;
} bitmap_t;

Read bitmap

Every bitmap should start with a 14 byte header that stores information about the file. The first two characters of the header should be B and M; we need to check this to ensure that we are indeed reading a valid bitmap file. fread will read 2 items, each of size 1 byte, from the file pointer and store the characters in header_field.

// BM Header
char header_field[2];
fread(header_field, 1, 2, file);

// Reject the file if either signature byte is wrong
if(header_field[0] != 'B' || header_field[1] != 'M')
    return NULL;
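A slightly more defensive variant of this check also looks at the return value of fread, so that a file shorter than two bytes is rejected too. This is only a sketch: the function name `has_bmp_signature` is hypothetical and not part of the program, and it rewinds the stream to the start before reading.

```c
#include <stdio.h>

/* Sketch only: returns 1 if the stream starts with "BM", 0 otherwise.
   `has_bmp_signature` is a made-up name, not the program's API. */
int has_bmp_signature(FILE *file)
{
    unsigned char sig[2];

    if (file == NULL || fseek(file, 0, SEEK_SET) != 0)
        return 0;
    /* fread returns the number of items read; a short read means the
       file is too small to be a bitmap. */
    if (fread(sig, 1, sizeof sig, file) != sizeof sig)
        return 0;
    /* Both bytes must match. */
    return sig[0] == 'B' && sig[1] == 'M';
}
```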

Right after the “BM” is the size of the file, which is a 4 byte integer. So we read 4 bytes from the file.

unsigned int file_size;
fread(&file_size, sizeof(unsigned int), 1, file);
bmp->file_size = file_size;

At byte offset 10 we have the offset, a 4 byte integer that stores the starting address of the pixel information.

// Offset
unsigned int offset;
fseek(file, 10, SEEK_SET);
fread(&offset, sizeof(unsigned int), 1, file);
bmp->offset = offset;

After the first header is the DIB header, which has information about the image. The first 4 bytes store the size of this header. The next 8 store the width and the height, each of size 4 bytes. The next 2 bytes store the number of color planes, followed by 2 bytes which store the depth of a pixel (the number of bits per pixel).

// DIB header info
unsigned int h_size;
fread(&h_size, sizeof(unsigned int), 1, file);

// Width and height
unsigned int width, height;
unsigned short planes, depth;
fread(&width, 4, 1, file);
fread(&height, 4, 1, file);
fread(&planes, 2, 1, file);
fread(&depth, 2, 1, file);
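The fread calls above rely on the machine being little-endian and on unsigned int being 4 bytes wide. As a hedged alternative, the same fields can be decoded from a byte buffer one byte at a time. The struct `bmp_info_t` and the functions `read_u16_le`, `read_u32_le` and `parse_bmp_header` are names I made up for this sketch; the field positions follow the layout described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct for this sketch; it mirrors the fields read above. */
typedef struct bmp_info
{
    uint32_t file_size, offset, dib_size, width, height;
    uint16_t planes, depth;
} bmp_info_t;

/* Assemble little-endian integers byte by byte so the result does not
   depend on the host's byte order or on sizeof(unsigned int). */
static uint16_t read_u16_le(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_u32_le(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Parse the first 30 bytes of a BMP file. Returns 1 on success, 0 if the
   buffer is too short or the "BM" signature is missing. */
int parse_bmp_header(const unsigned char *buf, size_t len, bmp_info_t *out)
{
    if (buf == NULL || out == NULL || len < 30)
        return 0;
    if (buf[0] != 'B' || buf[1] != 'M')
        return 0;

    out->file_size = read_u32_le(buf + 2);  /* bytes 2-5   */
    out->offset    = read_u32_le(buf + 10); /* bytes 10-13 */
    out->dib_size  = read_u32_le(buf + 14); /* bytes 14-17 */
    out->width     = read_u32_le(buf + 18); /* bytes 18-21 */
    out->height    = read_u32_le(buf + 22); /* bytes 22-25 */
    out->planes    = read_u16_le(buf + 26); /* bytes 26-27 */
    out->depth     = read_u16_le(buf + 28); /* bytes 28-29 */
    return 1;
}
```

Width and height are treated as unsigned here to match the bitmap struct shown earlier.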

The next parts of the header are not really used by the program so we’ll just jump to the pixel array offset using fseek.

fseek(file, offset, SEEK_SET);

Our program reads bitmaps whose pixel format is ARGB32: alpha, red, green and blue.

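unsigned char r, g, b, a;
int index = 0;
// Read until the file size from the header; '<' also stops on overshoot
while( ftell(file) < (long) file_size)
{
    // Each pixel is stored as 4 bytes in B, G, R, A order
    fread(&b, sizeof(unsigned char), 1, file);
    fread(&g, sizeof(unsigned char), 1, file);
    fread(&r, sizeof(unsigned char), 1, file);
    fread(&a, sizeof(unsigned char), 1, file);
    ...
}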

Notice that the order used to read the pixel is BGRA, the reverse of ARGB. This is because the bitmap stores multi-byte values least significant byte first. You can read more about endianness.
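A short sketch shows why the bytes come out in that order. It treats a pixel as one 32-bit value laid out as 0xAARRGGBB and writes it least significant byte first. The function name `argb32_to_le_bytes` is hypothetical and only used for this illustration.

```c
#include <stdint.h>

/* Store a 32-bit ARGB value (0xAARRGGBB) least significant byte first,
   the way a little-endian file format writes it. */
void argb32_to_le_bytes(uint32_t argb, unsigned char out[4])
{
    out[0] = (unsigned char)(argb & 0xFF);         /* blue  */
    out[1] = (unsigned char)((argb >> 8) & 0xFF);  /* green */
    out[2] = (unsigned char)((argb >> 16) & 0xFF); /* red   */
    out[3] = (unsigned char)((argb >> 24) & 0xFF); /* alpha */
}
```

For a pixel with alpha 255, red 0, green 126 and blue 160 (0xFF007EA0), the bytes in the file are 160, 126, 0, 255.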

Encoding

To encode a character to a pixel, we first need to determine the binary representation of the char. Below is the method I came up with.

char bin[9] = { '0', '0', '0','0', '0', '0', '0', '0' ,'\0' };
unsigned int cv = c;

for(int i = 7; i > -1; i --)
{
    if(cv % 2 != 0)
    {
        bin[i] = '1';
    }
    else
    {
        bin[i] = '0';
    }

    cv = cv / 2;
}
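The same string can also be produced with shifts instead of repeated division. This is a sketch of an alternative, not the code used in the program, and `byte_to_bits` is a name chosen for this example.

```c
/* Write the 8 bits of `c` into `bin` as '0'/'1' characters, most
   significant bit first, followed by a terminating '\0'. */
void byte_to_bits(unsigned char c, char bin[9])
{
    for (int i = 0; i < 8; i++)
        bin[i] = ((c >> (7 - i)) & 1) ? '1' : '0';
    bin[8] = '\0';
}
```

Calling it with 72 (the character H) fills `bin` with 01001000.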

Next we need to change the 2 least significant bits of the r, g, b and a values of the pixel to zero so that we can add our own values. So if we have 11111111 we convert it to 11111100. To do this, subtract the remainder of dividing the value by 4:

value = value - (value % 4)

px->r = px->r - (px->r % 4);
px->g = px->g - (px->g % 4);
px->b = px->b - (px->b % 4);
px->a = px->a - (px->a % 4);

Now we encode our data. The last two bits can hold 0, 1, 2 or 3. Taking the binary representation obtained in the previous step, we add the decimal value of the first two digits to a, the next two to r, then g, then b, as shown below.

if(bin[0] == '0' && bin[1] == '1') px->a ++;
else if(bin[0] == '1' && bin[1] == '0') px->a += 2;
else if(bin[0] == '1' && bin[1] == '1') px->a += 3;

if(bin[2] == '0' && bin[3] == '1') px->r ++;
else if(bin[2] == '1' && bin[3] == '0') px->r += 2;
else if(bin[2] == '1' && bin[3] == '1') px->r += 3;

if(bin[4] == '0' && bin[5] == '1') px->g ++;
else if(bin[4] == '1' && bin[5] == '0') px->g += 2;
else if(bin[4] == '1' && bin[5] == '1') px->g += 3;

if(bin[6] == '0' && bin[7] == '1') px->b ++;
else if(bin[6] == '1' && bin[7] == '0') px->b += 2;
else if(bin[6] == '1' && bin[7] == '1') px->b += 3;
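Putting the steps together, a whole message can be hidden by encoding one character per pixel and finishing with the \0 terminator. The sketch below uses shifts and masks rather than the string of '0' and '1' characters. The names `put_byte` and `hide_text` are hypothetical, and the pixel struct is repeated so the example compiles on its own.

```c
#include <stddef.h>
#include <string.h>

typedef struct pixel
{
    unsigned char b, g, r, a;
} pixel_t;

/* Pack one byte into the low 2 bits of a, r, g and b, in that order. */
static void put_byte(pixel_t *px, unsigned char c)
{
    px->a = (unsigned char)((px->a & 0xFC) | ((c >> 6) & 3));
    px->r = (unsigned char)((px->r & 0xFC) | ((c >> 4) & 3));
    px->g = (unsigned char)((px->g & 0xFC) | ((c >> 2) & 3));
    px->b = (unsigned char)((px->b & 0xFC) | (c & 3));
}

/* Hide `text` plus its '\0' terminator, one character per pixel.
   Returns 1 on success, 0 if the image has too few pixels. Nothing is
   modified when the text does not fit. */
int hide_text(pixel_t *pixels, size_t count, const char *text)
{
    size_t len;

    if (pixels == NULL || text == NULL)
        return 0;
    len = strlen(text);
    if (len + 1 > count) /* need one extra pixel for the terminator */
        return 0;
    for (size_t i = 0; i <= len; i++) /* i == len writes the '\0' */
        put_byte(&pixels[i], (unsigned char)text[i]);
    return 1;
}
```

A 2x2 image has 4 pixels, so it can hold a 3 character message such as Hi! plus the terminator, but not a 4 character one.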

Decoding

To decode we need to get the 2 least significant bits of the a, r, g and b values.

int a = px->a % 4;
int r = px->r % 4;
int g = px->g % 4;
int b = px->b % 4;

From these values we can now determine the ASCII value of the character, e.g.

if(a == 1) value += 64;			// 0100 0000
else if(a == 2) value += 128;	// 1000 0000
else if(a == 3) value += 192;	// 1100 0000

if(r == 1) value += 16;			// 0001 0000
else if(r == 2) value += 32;	// 0010 0000
else if(r == 3) value += 48;	// 0011 0000

if(g == 1) value += 4;			// 0000 0100
else if(g == 2) value += 8;		// 0000 1000
else if(g == 3) value += 12;	// 0000 1100

if(b == 1) value += 1;			// 0000 0001
else if(b == 2) value += 2;		// 0000 0010
else if(b == 3) value += 3;		// 0000 0011

If you cast the value to a char you’ll get your character. For the pixel from the example above, the value is 72, which is H.
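The decoding steps can be condensed with shifts, and repeated pixel by pixel until the \0 terminator is found. This is a sketch under the same assumptions as the rest of the article (one character per pixel, pairs stored in A, R, G, B order). The names `get_byte` and `reveal_text` are made up for the example, and the pixel struct is repeated so it compiles on its own.

```c
#include <stddef.h>

typedef struct pixel
{
    unsigned char b, g, r, a;
} pixel_t;

/* Rebuild one byte from the low 2 bits of a, r, g and b. */
static unsigned char get_byte(const pixel_t *px)
{
    return (unsigned char)(((px->a & 3) << 6) | ((px->r & 3) << 4) |
                           ((px->g & 3) << 2) | (px->b & 3));
}

/* Read characters until a '\0' is decoded, the pixels run out, or `out`
   is full. `out` is always null-terminated. Returns the number of
   characters written, not counting the terminator. */
size_t reveal_text(const pixel_t *pixels, size_t count,
                   char *out, size_t out_size)
{
    size_t n = 0;

    if (pixels == NULL || out == NULL || out_size == 0)
        return 0;
    while (n < count && n + 1 < out_size) {
        unsigned char c = get_byte(&pixels[n]);
        if (c == '\0')
            break;
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

For example, a pixel with a = 253, r = 0, g = 126 and b = 160 decodes to 64 + 0 + 8 + 0 = 72, which is H.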


Practical applications

Apart from hiding text inside images, steganography can be used to watermark content. For example, if I am a movie director and I need to share my movie with some people for review before the official release, I could hide a unique identifier in each copy of the movie file I share. So in case someone decides to get clever and leak the movie, I can tell from the identifier who leaked it.

I hope you enjoyed! Here is the link to the full source code; instructions on how to build and run are in the README.

References and other material

BMP file format - Wikipedia

Steganography - Wikipedia

Secrets Hidden in Images (Steganography) - Computerphile (YouTube)