By Akram on 4:06:00 PM



The Hurd Hacking Guide










Copyright © 2001, 2002, 2005, 2007 Free Software Foundation, Inc.

Written by Wolfgang Jährling, wolfgang@pro-linux.de.

Permission is granted to copy, distribute and/or modify this document
under the terms of the GNU Free Documentation License, Version 1.1 or
any later version published by the Free Software Foundation; with no
Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
Texts. For details see http://www.gnu.org/copyleft/fdl.html.










1 About this document












1.1 Conventions



The version of this document follows the convention
<hurd version>_<document release>.
This means that 0.2_7 is the seventh release since Hurd 0.2, and a
version like 0.4_1 would be the first release for Hurd 0.4 (which, of
course, is not yet available :)).
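The convention can be checked mechanically. The following is only a minimal sketch: the struct and function names are invented for illustration, and it assumes that the Hurd version always has the form major.minor, which the two examples above suggest but the convention itself does not promise.

```c
#include <stdio.h>

/* Hypothetical type, not part of the guide: the pieces of a version
   string of the form <hurd version>_<document release>.  */
struct guide_version
{
  int hurd_major;
  int hurd_minor;
  int doc_release;
};

/* Return 0 on success, -1 if TEXT does not follow the convention.
   On failure, *OUT may be partially filled in.  */
int
parse_guide_version (const char *text, struct guide_version *out)
{
  char trailing;
  int fields = sscanf (text, "%d.%d_%d%c", &out->hurd_major,
                       &out->hurd_minor, &out->doc_release, &trailing);

  /* Exactly three conversions means there was no trailing garbage.  */
  return fields == 3 ? 0 : -1;
}
```

Parsing "0.2_7" this way yields Hurd version 0.2 and document release 7, while a string without the underscore part is rejected.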

$(HURD) means the top directory of the `Hurd' source tree.
$(GNUMACH) means the top directory of the `GNU Mach' source tree.
$(MIG) means the top directory of the `MIG' source tree.
$(GLIBC) means the top directory of the `GNU Libc' source tree.

Single shell commands start with a $ for user commands and a
# for root commands:

     $ diff -u libtrivfs.old/open.c libtrivfs/open.c
     # reboot

Prompts of other programs look as they do in the respective
application. For example, the GDB prompt is indicated by
(gdb):

     (gdb) break trivfs_S_io_write


I will try to obey the GNU Coding Standards in my C examples.
http://www.gnu.org/prep/standards_toc.html









1.2 Topic



This document is an introduction to GNU Hurd and Mach programming. The
purpose of this guide is to help interested people start hacking the
Hurd or extending it (by writing translators). It gives lots of
references to the Hurd or GNU Mach source files. It is recommended
that you read through some of these sources. Indeed the Hurd sources
are very well written and commented and you can learn a lot by reading
them.

The Hurd looks very complex and hard to learn at first glance. But
it isn't, because you don't need to understand everything at once: you
can proceed slowly, step by step, and apply your existing
knowledge. There are also libraries that make hacking certain
common kinds of translators easy. I think that the only problem is
the absence of nice documentation like the “Linux Module Programming
Guide”, which makes it possible to get into the subject step by
step. This document tries to fill that gap.

Mach and MIG are not handled in depth here, so if you want specific
information on them, I recommend reading the GNU Mach Reference Manual [1]
and the documentation about MIG available on the internet.

The Hurd Hacking Guide is not intended to be a complete reference, but
it ought to help you get started. The only real reference at the
time of writing is the Hurd source code.

The order of chapters in this document was originally partly based on
a mail from Farid Hajji [2], which in turn seems to be based on
$(HURD)/doc/navigating.








1.3 Feedback



Don't hesitate to send me improvements, corrections and
extensions. You are of course welcome to correct my lousy English;
I'm not a native speaker/writer. (Thanks to Alfred M. Szmidt for his
corrections, BTW.)

I would be happy to hear about how understandable this text is. Did
you “get” everything? Was some part confusing? Send me feedback!

Actually, I wrote this to document everything I learned about the
Hurd, so that I could later quickly look up a detail that I had
forgotten. This means that if you send me extensions, I will also
profit from them.




2 Requirements




  1. You should know at least basic things about the Hurd [3].

  2. More specific knowledge is of course welcome [4].

  3. It would be good to know what a translator is [5].

  4. The sources of Hurd and GNU Mach are also useful:

              mkdir ~/hurd-cvs
              cd ~/hurd-cvs/
              cvs -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/hurd \
                  co hurd
              cvs -d:pserver:anonymous@cvs.savannah.gnu.org:/sources/hurd \
                  co -r gnumach-1-branch gnumach

  5. A GNU/Hurd installation certainly will help you, too [6].

  6. Diving into the header files of the Hurd's libraries is dangerous:
    you can drown very easily, because you won't find your way out
    of all these data structures. A list of where to find what might be
    helpful, e.g. the output of

              $ cd $(HURD) && egrep '^struct [^;*]+$' */*.h

  7. If you know the principles of Mach, what MIG is, etc., then this will
    of course help you a lot, but it should not be necessary. Knowing
    about the Linux kernel might also help to some degree.

  8. Oh, and you should know the C programming language. :-)










3 A Short Overview of Hurd and Mach




“We're way ahead of you here. The Hurd has always been on the cutting edge of
not being good for anything.” (Roland McGrath [7])



“In short: just say NO TO DRUGS, and maybe you won't end up like the Hurd
people.” (Linus Torvalds [8])


Every reasonable piece of software needs a method for communication
between components. Nowadays, things like CORBA or Mozilla's XPCOM are
used for that. The advantage of the Hurd over other systems is that
it provides such a facility and does not require existing
applications to be modified to take advantage of its communication
framework. How does the Hurd reach this goal?


“The Hurd, at its most central core, is just the protocols that the
cooperating servers use.” (Thomas Bushnell [9])


In a Mach environment, communication between programs is mostly done by sending
messages through so-called “ports”, which are a kind of message queue. For
each port, there is one task with receive-permission (i.e. this task receives
the messages someone sends to this port). Other tasks might have a
send-permission or a send-once-permission (which is used for getting a reply
from a server, because ports are one-way channels) for this port, or even no
permission at all.
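The roles described above can be made concrete with a toy model. This is only a sketch in plain C: the types and functions are invented for illustration, they are not Mach data structures or calls, and real ports carry typed messages rather than integers. It shows a port as a one-way queue that only its receiver drains, and a send-once right that is spent after a single message, which is why it suits replies.

```c
#include <stddef.h>

/* Toy model only: nothing here is a real Mach type.  */
#define TOY_QUEUE_CAPACITY 8

struct toy_port
{
  int messages[TOY_QUEUE_CAPACITY];  /* Queued messages, oldest first.  */
  size_t count;
};

struct toy_send_once_right
{
  struct toy_port *port;
  int used;                          /* Nonzero once the right is spent.  */
};

/* Queue MSG on PORT.  Return 0 on success, -1 if the queue is full.  */
int
toy_send (struct toy_port *port, int msg)
{
  if (port->count == TOY_QUEUE_CAPACITY)
    return -1;
  port->messages[port->count++] = msg;
  return 0;
}

/* A send-once right permits exactly one message.  */
int
toy_send_once (struct toy_send_once_right *right, int msg)
{
  if (right->used || toy_send (right->port, msg) != 0)
    return -1;
  right->used = 1;
  return 0;
}

/* Only the holder of the receive right dequeues.  Store the oldest
   message in *MSG and return 0, or return -1 if the queue is empty.  */
int
toy_receive (struct toy_port *port, int *msg)
{
  size_t i;

  if (port->count == 0)
    return -1;
  *msg = port->messages[0];
  for (i = 1; i < port->count; i++)
    port->messages[i - 1] = port->messages[i];
  port->count--;
  return 0;
}
```

A client in this model would create a reply port, hand the server a send-once right for it, and then wait on toy_receive () for the single answer.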

If you read through $(GNUMACH)/include/mach/port.h, you may
notice that there are more port rights. A send or send-once right is
turned into a “dead name” if the receive right is destroyed. You
can't use the dead name right for anything; it's merely a
placeholder. Another port right is “port set”, which Marcus
Brinkmann explains as follows [10]:


“A port set is a set of ports. It is useful to combine ports into a
port set if you just want the next message on any of the ports you
have a receive right for. In the Hurd, we use port classes and buckets
provided by libports, though.

Well, we are not exactly strict in our wording when talking about
ports. [...] You can think of port rights as capabilities associated
with ports and port sets, if you want.”



How can a process find a specific port? It's really simple: through
the file system. For example, the services of an ext2 server are
available through the node where the file system it handles was
“mounted” (please note that there is no such thing as mounting in
the Hurd world; this is what it would be called under Unix. The
correct term for the GNU/Hurd operating system would be “setting a
translator”).

Your favourite email client doesn't support random signatures? Write
a random signature translator [11] (which returns a new signature each
time you read from it). And the best thing is: now all email clients may
use this feature! Do you see how the Hurd encourages code reuse? Do you
see how GNU/Hurd does not require programs to be modified to take
advantage of most of the nifty features it provides?

If you would like to learn more about the Hurd file system, I highly
recommend reading the presentation “The Hurd” [12].

We can say that the file system is the name space for services. This
is also true in the other direction: the name space for services is the
file system. This is a very important thing to understand. While the file
system is the canonical way to get a port, there are other ways as
well; for example, you can get a port in a message.

If you are wondering why I compared this kind of communication with
CORBA, the following quote from the paper “Towards a New Strategy of
OS Design” might help you understand the reason:


“With translators, the filesystem can act as a rendezvous for
interfaces which are not similar to files. Consider a service which
implements some version of the X protocol, using Mach messages as an
underlying transport. For each X display, a file can be created with
the appropriate program as its translator. X clients would open that
file. At that point, few file operations would be useful (read and
write, for example, would be useless), but new operations
(XCreateWindow or XDrawText) might become meaningful. In this case,
the filesystem protocol is used only to manipulate characteristics of
the node used for the rendezvous. The node need not support I/O
operations, though it should reply to any such messages with a
message_not_understood return code.”





4 Basics of Mach and MIG







4.1 Mach ports



Now we will take a look at some Mach details. Yes, I know that you
would like to start writing translators as soon as possible, but you
really should know at least the basics of Mach. Mach ports are used
extensively throughout the Hurd, so let's start with them.

First, let's make the distinction between ports, port rights and port
names clear. Marcus Brinkmann once wrote (on IRC):


“mach_port_t is a port name, the port name denotes an entry in the
task's port name space, which is associated with either a dead name, a send-once
right, or a combination of a receive right and a send right with potentially
many user references.

Assume you have mach_port_t 5, and want to send a message to
it. To send a message, you pass the task mach_task_self(), the
port name, the msgid and the arguments. The task is used to get the
ipc name space, the port name is used to find the entry in this name
space. The entry tells Mach about the port rights you have for the
associated port with this port name.

You have a single port name for all receive/send rights you might have
for a port. But you have distinct port names for send once rights,
because this is easier for Mach to manage — and for user programs,
too.”



Mach defines the type natural_t, which is the native type of the
machine, e.g. 32 bits on a 32-bit processor. natural_t is always
unsigned, but there is a signed variant called integer_t. The
definitions for the i386 platform can be found in
$(GNUMACH)/i386/include/mach/i386/vm_types.h. In this file you
can find other interesting types as well, but those are not important
for us right now.

Like Unix file descriptors, Mach port names are plain, boring
integers. In the file $(GNUMACH)/include/mach/port.h you can
see the following definitions:

     typedef natural_t mach_port_t;
     typedef mach_port_t *mach_port_array_t;

(For most data types in Mach, an additional *_array_t type is
defined.) The port number identifies a unique port (in the name space
of the task), thus a mach_port_t value is often referred to as a “port
name”.

A value of MACH_PORT_DEAD (i.e. ~0) represents a port right
that has died, while MACH_PORT_NULL (quoting port.h)
“indicates the absence of any port or port right.”

You may check with MACH_PORT_VALID (port) if a port is neither
of these two values.
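The two special values and the validity check can be sketched as follows. The typedef and macros are local stand-ins written to mirror what port.h provides, so that the fragment compiles without the Mach headers; on a real system you would include the header instead. The helper function is hypothetical and assumes a 32-bit natural_t.

```c
/* Local stand-ins so the sketch compiles without the Mach headers; on a
   real system these come from $(GNUMACH)/include/mach/port.h.  */
typedef unsigned int mach_port_t;   /* natural_t on a 32-bit machine */

#define MACH_PORT_NULL ((mach_port_t) 0)
#define MACH_PORT_DEAD ((mach_port_t) ~0)
#define MACH_PORT_VALID(name) \
  ((name) != MACH_PORT_NULL && (name) != MACH_PORT_DEAD)

/* Hypothetical helper: describe a port name for a log message.  */
const char *
describe_port_name (mach_port_t name)
{
  if (name == MACH_PORT_NULL)
    return "null";
  if (name == MACH_PORT_DEAD)
    return "dead";
  return "valid";
}
```

Checking MACH_PORT_VALID () before using a name catches both cases at once, which is usually all a caller needs.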

For the port rights, we have the macros with the names
MACH_PORT_RIGHT_RECEIVE,
MACH_PORT_RIGHT_SEND,
MACH_PORT_RIGHT_SEND_ONCE,
MACH_PORT_RIGHT_PORT_SET,
MACH_PORT_RIGHT_DEAD_NAME and
MACH_PORT_RIGHT_NUMBER
which are of type mach_port_right_t:

     typedef natural_t mach_port_right_t;


This type is used whenever we need to act on a particular port
right. Often, however, we want to carry around a set of rights,
because we may have multiple rights on a single port. For this, we use
another type:

     typedef natural_t mach_port_type_t;
     typedef mach_port_type_t *mach_port_type_array_t;

A mach_port_type_t variable may either carry the value
MACH_PORT_TYPE_NONE, which of course represents an empty set of
rights, or a set of the macros MACH_PORT_TYPE_SEND,
MACH_PORT_TYPE_RECEIVE etc. combined with a bitwise or. There
are also several predefined combinations:

Macro                          Combination

MACH_PORT_TYPE_SEND_RECEIVE    MACH_PORT_TYPE_SEND,
                               MACH_PORT_TYPE_RECEIVE

MACH_PORT_TYPE_SEND_RIGHTS     MACH_PORT_TYPE_SEND,
                               MACH_PORT_TYPE_SEND_ONCE

MACH_PORT_TYPE_PORT_RIGHTS     MACH_PORT_TYPE_SEND_RIGHTS,
                               MACH_PORT_TYPE_RECEIVE

MACH_PORT_TYPE_PORT_OR_DEAD    MACH_PORT_TYPE_PORT_RIGHTS,
                               MACH_PORT_TYPE_DEAD_NAME

MACH_PORT_TYPE_ALL_RIGHTS      MACH_PORT_TYPE_PORT_OR_DEAD,
                               MACH_PORT_TYPE_PORT_SET



Don't confuse the MACH_PORT_RIGHT_* with the
MACH_PORT_TYPE_* macros. They have similar names, but different
meanings as well as different values.
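The difference is easiest to see in code. The sketch below uses local stand-ins rather than the real header, and the concrete numbers (small integers for the rights, one bit per right shifted into the upper half for the types) are an assumption about the layout in port.h; check your header before relying on them. The helper port_type_has () is hypothetical.

```c
/* Local stand-ins; the values are assumed to match mach/port.h.  */
typedef unsigned int mach_port_right_t;
typedef unsigned int mach_port_type_t;

#define MACH_PORT_RIGHT_SEND      ((mach_port_right_t) 0)
#define MACH_PORT_RIGHT_RECEIVE   ((mach_port_right_t) 1)
#define MACH_PORT_RIGHT_SEND_ONCE ((mach_port_right_t) 2)

/* Each right becomes one bit, so several rights fit in one type value.  */
#define MACH_PORT_TYPE(right) ((mach_port_type_t) (1u << ((right) + 16)))

#define MACH_PORT_TYPE_NONE      ((mach_port_type_t) 0)
#define MACH_PORT_TYPE_SEND      MACH_PORT_TYPE (MACH_PORT_RIGHT_SEND)
#define MACH_PORT_TYPE_RECEIVE   MACH_PORT_TYPE (MACH_PORT_RIGHT_RECEIVE)
#define MACH_PORT_TYPE_SEND_ONCE MACH_PORT_TYPE (MACH_PORT_RIGHT_SEND_ONCE)
#define MACH_PORT_TYPE_SEND_RECEIVE \
  (MACH_PORT_TYPE_SEND | MACH_PORT_TYPE_RECEIVE)

/* Hypothetical helper: nonzero if TYPE includes every bit in WANTED.  */
int
port_type_has (mach_port_type_t type, mach_port_type_t wanted)
{
  return (type & wanted) == wanted;
}
```

A MACH_PORT_RIGHT_* value names one right and is what you pass when acting on that right; a MACH_PORT_TYPE_* value is a bit mask and is what you test or combine with a bitwise or.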

More details on Mach IPC can be found in the GNU Mach Reference
Manual, Chapter 4 (“Inter Process Communication”) [13].











4.2 Threads and Tasks







4.3 MIG



Making RPCs (Remote Procedure Calls) by sending Mach messages is not
trivial. Making it easier is the purpose of MIG (Mach Interface
Generator). You must write an interface definition file, feed it into
MIG and then it outputs two C sources and a header file, which do the
Mach port magic for you. Then you can send messages with simple
function calls.

Of course, you only need to do that if you want to define your own
interfaces. The Hurd already contains various interfaces. The
appropriate functions are in glibc, thus you don't have to specify any
special flags if you want to use them.

The syntax of MIG files is similar to Pascal and should not be hard to
understand. We will talk about some details later.





5 The Hurd Interfaces



Understanding concepts is a very important thing, but this alone will
not enable you to get any work done. You also need to know about the
interfaces which make it possible to use those concepts.

You can find the interface definitions in the $(HURD)/hurd/*.defs files.
Important are io.defs, password.defs, fsys.defs,
fs.defs and auth.defs. The login.defs interface is nice,
but not implemented so far and maybe it will never be. The *_reply.defs
interfaces are — of course — for reply messages.

You should definitely take a look at the Hurd interfaces now. In the
next chapter, we will see how one uses those interfaces.




6 How does this look in practice?








6.1 Writing a file to standard output



Let's take a closer look at a program that dumps a file to its
standard output in a “hurdish” way. Please note that the non-Hurd
way may still be used: glibc still provides functions like write ()
on GNU/Hurd systems. The generic Hurd parts of glibc are in
$(GLIBC)/hurd/; the Mach-dependent parts are in
$(GLIBC)/sysdeps/mach/hurd/.

     /* dump.c - Dump a file to stdout in a "hurdish" way.
      *
      * Copyright (C) 2001, 2002, 2007 Free Software Foundation, Inc.
      *
      * Written by Wolfgang Jährling <wolfgang@pro-linux.de>.
      *
      * Distributed under the terms of the GNU General Public License.
      * This is distributed "as is". No warranty is provided at all.
      */

     #define _GNU_SOURCE 1

     #include <hurd.h>
     #include <hurd/io.h>

     #include <fcntl.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <errno.h>
     #include <error.h>

     int
     main (int argc, char *argv[])
     {
       file_t f;
       mach_msg_type_number_t amount;
       char *buf;
       error_t err;

       if (argc != 2)
         error (1, 0, "Usage: %s <filename>", argv[0]);

       /* Open file */
       f = file_name_lookup (argv[1], O_READ, 0);
       if (f == MACH_PORT_NULL)
         error (1, errno, "Could not open %s", argv[1]);

       /* Get size of file (buggy! See below) */
       err = io_readable (f, &amount);
       if (err)
         error (1, err, "Could not get number of readable bytes");

       /* Create buffer */
       buf = malloc (amount + 1);
       if (buf == NULL)
         error (1, 0, "Out of memory");

       /* Read; the RPC returns its error code, it does not set errno */
       err = io_read (f, &buf, &amount, -1, amount);
       if (err)
         error (1, err, "Could not read from file %s", argv[1]);
       buf[amount] = '\0';
       mach_port_deallocate (mach_task_self (), f);

       /* Output */
       printf ("%s", buf);
       return 0;
     }

You may compile this with:

     $ gcc -g -o dump dump.c


But let's look at the interesting pieces of this program:

     #define _GNU_SOURCE 1


You should always define this macro for GNU/Hurd specific
programs. Otherwise, you will not even be able to compile them. Define
it before including any headers. Alternatively, you might want to
pass -D_GNU_SOURCE to gcc, by adding it to CPPFLAGS, for example.

     file_t f;


A file_t is actually a mach_port_t, but we use file_t to
make clear we use this to open a file.

     char *buf;


You may have noticed that we will use this buffer in a situation where we
should pass a data_t (which is typedef'ed as a char *), but
$(HURD)/hurd/hurd_types.h states about data_t and several other
types:


“These names exist only because of MIG deficiencies. You should not use them
in C source; use the normal C types instead.”


     error_t err;


There is also the Mach type kern_return_t, but in the Hurd, error_t is
the better choice. Marcus Brinkmann explains:


“kern_return_t is the mach error type from mach_msg for example, so
if you call an RPC, and want to do it in a Mach compatible fashion,
use kern_return_t. BUT on the Hurd, we use error_t, because that is
compatible with the glibc error types. You can always cast from
kern_return_t to error_t on GNU systems.”


     if (argc != 2)
       error (1, 0, "Usage: %s <filename>", argv[0]);

Just in case you are not familiar with the error () function, I
will quote /include/error.h (Remember we don't use /usr in the
Hurd :-)):

     /* Print a message with `fprintf (stderr, FORMAT, ...)';
        if ERRNUM is nonzero, follow it with ": " and strerror (ERRNUM).
        If STATUS is nonzero, terminate the program with `exit (STATUS)'. */
     extern void error (int status, int errnum, const char *format, ...);
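The examples in this guide always pass a nonzero status, so error () never returns. The following sketch shows the other case described in the comment: with a status of 0 the message is printed and execution continues. The helper name is invented for illustration, and error.h is a glibc extension, so this assumes a glibc-based system.

```c
#include <errno.h>
#include <error.h>
#include <stdio.h>

/* Hypothetical helper: check whether PATH can be opened for reading.
   With STATUS 0, error () only prints the message and returns, so the
   failure can be reported to the caller instead of ending the program.  */
int
check_readable (const char *path)
{
  FILE *f = fopen (path, "r");

  if (f == NULL)
    {
      error (0, errno, "Could not open %s", path);
      return -1;
    }
  fclose (f);
  return 0;
}
```

Passing errno as the second argument appends the strerror () text for it, as the quoted comment describes.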

Now the actual action starts. We try to open the file with the glibc
function file_name_lookup (). This function returns
MACH_PORT_NULL if the attempt failed. O_READ is a GNU extension
and is the same as the POSIX constant O_RDONLY. In the same way,
O_WRITE is identical to O_WRONLY.

     /* Open file */
     f = file_name_lookup (argv[1], O_READ, 0);
     if (f == MACH_PORT_NULL)
       error (1, errno, "Could not open %s", argv[1]);

Of course we could read the file character-by-character, but in this
example we assume that it is a “normal” file and may be read at
once, so we use io_readable () to find out about the size of
the file (this won't work for files like /dev/random!
io_readable () only tells us how much data is available right
now). Then we allocate a buffer for the whole file:

     /* Get size of file */
     err = io_readable (f, &amount);
     if (err)
       error (1, err, "Could not get number of readable bytes");

     /* Create buffer */
     buf = malloc (amount + 1);
     if (buf == NULL)
       error (1, 0, "Out of memory");

Now all we have to do is read the file into the buffer. We put a
0-byte at the end, so we will be able to use the file content as a
string (we assume that the file does not contain any 0-bytes, which is
bad style, but in this case, we don't care).

     /* Read; the RPC returns its error code, it does not set errno */
     err = io_read (f, &buf, &amount, -1, amount);
     if (err)
       error (1, err, "Could not read from file %s", argv[1]);
     buf[amount] = '\0';

Note that we pass &buf, which is a pointer to a pointer. buf
itself might get modified, but this decision is up to the receiver of
the io_read message.

If you look at $(HURD)/hurd/io.defs, you may wonder why io_read
has only four arguments there, while we passed five. io_object,
data, offset and amount are written down in the .defs file, while
/include/hurd/io.h has an additional argument (called dataCnt)
after data. This makes perfect sense: we also need to learn
how much data we actually got. MIG adds this argument
automatically.

We are finished using the file, so we can close it. This is done by
deallocating the port:

     mach_port_deallocate (mach_task_self (), f);


That's all.
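The walkthrough above terminates the buffer with a 0-byte and prints it with %s, which stops at the first 0-byte in the file. A length-based alternative avoids that assumption. This is a minimal sketch in portable C; emit_bytes () is a hypothetical helper, not something the guide or the Hurd provides.

```c
#include <stdio.h>

/* Hypothetical helper: write LEN bytes of BUF to OUT, including any
   0-bytes.  Return 0 on success, -1 on a short write.  */
int
emit_bytes (FILE *out, const char *buf, size_t len)
{
  return fwrite (buf, 1, len, out) == len ? 0 : -1;
}
```

In dump.c this would replace the printf () call, with amount as the length, and the extra byte for the terminator would no longer be needed.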




6.2 Creating a copy of a file



Now we will try to copy a file. But this time, we will do it right: if
the translator that provides the file takes a while to deliver new data,
we will wait. So we will read until we reach a real EOF. We
know that we have reached the end of the file when our call to
io_read () gives us zero bytes of data.

     /* copy.c - Copy a file in a "hurdish" way.
      *
      * Copyright (C) 2001, 2007 Free Software Foundation, Inc.
      *
      * Written by Wolfgang Jährling <wolfgang@pro-linux.de>.
      *
      * Distributed under the terms of the GNU General Public License.
      * This is distributed "as is". No warranty is provided at all.
      */

     #define _GNU_SOURCE 1

     #include <hurd.h>
     #include <hurd/io.h>

     #include <fcntl.h>
     #include <stdio.h>
     #include <stdlib.h>
     #include <errno.h>
     #include <error.h>

     #define BUFLEN 10 /* Arbitrary */

     int
     main (int argc, char *argv[])
     {
       file_t in, out;
       mach_msg_type_number_t rd_amount, wr_amount;
       char *buf, *ptr;
       error_t err;

       if (argc != 3)
         error (1, 0, "Usage: %s <inputfile> <outputfile>", argv[0]);

       /* Create buffer */
       buf = malloc (BUFLEN + 1);
       if (buf == NULL)
         error (1, 0, "Out of memory");

       /* Open files */
       in = file_name_lookup (argv[1], O_READ, 0);
       if (in == MACH_PORT_NULL)
         error (1, errno, "Could not open %s", argv[1]);
       out = file_name_lookup (argv[2], O_WRITE | O_CREAT | O_TRUNC, 0640);
       if (out == MACH_PORT_NULL)
         error (1, errno, "Could not open %s", argv[2]);

       /* Copy */
       while (1)
         {
           /* Read */
           err = io_read (in, &buf, &rd_amount, -1, BUFLEN);
           if (err)
             error (1, err, "Could not read from file %s", argv[1]);

           if (rd_amount == 0)
             break;

           /* Write */
           ptr = buf;
           do
             {
               err = io_write (out, ptr, rd_amount, -1, &wr_amount);
               if (err)
                 error (1, err, "Could not write to file %s", argv[2]);
               rd_amount -= wr_amount;
               ptr += wr_amount;
             }
           while (rd_amount);
         }

       mach_port_deallocate (mach_task_self (), in);
       mach_port_deallocate (mach_task_self (), out);
       return 0;
     }

Interesting parts are:

     out = file_name_lookup (argv[2], O_WRITE | O_CREAT | O_TRUNC, 0640);


Here we open the output file and create it, if it does not exist, with
permissions 0640 (which is of course `rw-r-----') minus the
umask.
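“Minus the umask” means that every permission bit set in the umask is cleared from the requested mode. A minimal sketch of that arithmetic, with a hypothetical helper name:

```c
#include <sys/types.h>

/* Hypothetical helper: the mode a newly created file ends up with when
   REQUESTED is passed at creation time and MASK is the process umask.  */
mode_t
effective_mode (mode_t requested, mode_t mask)
{
  return requested & ~mask & 07777;
}
```

With the common umask 022, the requested 0640 is unchanged, because 022 only removes write permission for group and others, which 0640 does not grant anyway. With a umask of 077 the result would be 0600.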

     /* Read */
     err = io_read (in, &buf, &rd_amount, -1, BUFLEN);
     if (err)
       error (1, err, "Could not read from file %s", argv[1]);

     if (rd_amount == 0)
       break;

As we said above: if we couldn't read any bytes, this indicates that
we reached the end of the file. It does not mean that there is no
data available at the moment: if we read from /dev/random, for
example, and there were no data to read at the moment, this call
would not tell us that there is no data, but would block until new
data is available.
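The same rule, that zero bytes means end of file and nothing else does, also holds for the ordinary read () call that glibc provides. The following sketch uses that non-Hurd interface, so it runs on any POSIX system. read_all () is a hypothetical helper; it reads until a real EOF instead of trusting a size reported in advance, which is what went wrong in dump.c.

```c
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: read FD until end of file.  On success, return a
   malloc'ed buffer and store its length in *LEN; on failure, return
   NULL.  The caller frees the buffer.  */
char *
read_all (int fd, size_t *len)
{
  size_t capacity = 4096, used = 0;
  char *buf = malloc (capacity);

  if (buf == NULL)
    return NULL;

  for (;;)
    {
      ssize_t got;

      /* Grow the buffer when it is full.  */
      if (used == capacity)
        {
          char *bigger = realloc (buf, capacity * 2);
          if (bigger == NULL)
            {
              free (buf);
              return NULL;
            }
          buf = bigger;
          capacity *= 2;
        }

      got = read (fd, buf + used, capacity - used);
      if (got < 0)
        {
          if (errno == EINTR)
            continue;           /* Interrupted: just try again.  */
          free (buf);
          return NULL;
        }
      if (got == 0)             /* Zero bytes: real end of file.  */
        break;
      used += (size_t) got;
    }

  *len = used;
  return buf;
}
```

Because the loop blocks in read () while no data is available, it works for slow translators as well as for regular files, but it never terminates on a source such as /dev/random that has no end.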

     /* Write */
     ptr = buf;
     do
       {
         err = io_write (out, ptr, rd_amount, -1, &wr_amount);
         if (err)
           error (1, err, "Could not write to file %s", argv[2]);
         rd_amount -= wr_amount;
         ptr += wr_amount;
       }
     while (rd_amount);

There is no guarantee that io_write () accepts all data we
attempt to feed into it immediately, so we might need to retry, but
only with the data that was not accepted until now.
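The retry pattern is not specific to io_write (); the POSIX write () call may also accept fewer bytes than requested. Here is a sketch of the same loop with that interface, so it can be tried on any POSIX system. write_all () is a hypothetical helper name.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: write all LEN bytes of BUF to FD, retrying with
   the remainder after each partial write.  Return 0 on success, -1 on
   error.  */
int
write_all (int fd, const char *buf, size_t len)
{
  while (len > 0)
    {
      ssize_t done = write (fd, buf, len);

      if (done < 0)
        {
          if (errno == EINTR)
            continue;           /* Interrupted: just try again.  */
          return -1;
        }

      /* Skip what was accepted and retry with the rest.  */
      buf += done;
      len -= (size_t) done;
    }
  return 0;
}
```

As in copy.c, the pointer advances and the remaining count shrinks by exactly the amount the receiver reported as written.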




6.3 Final notes




Finally I would like to note that on the GNU/Hurd system there is no
reason not to use the nice non-standard extensions of GCC. For
example, nested functions are used frequently throughout the Hurd
sources. It might be helpful to know about these extensions, so you
should probably do

     $ info gcc "C Extensions"



A nice example of “hurdish” code is $(HURD)/init/init.c, so
you should also take a look at that. You may not understand all of
it, but that doesn't matter. More sources you might want to read are
in $(HURD)/utils/ and $(HURD)/sutils/.




7 The Hurd Libraries (Overview)



There are several libraries which make writing translators of various
kinds easier.

libtrivfs is used for “trivial” translators. In this case, trivial
translators means all translators that provide only a single file
(node), as opposed to a complete directory or even a file system. As
the first translators any new Hurd hacker will develop are simple
single-file translators, this library is the first you should learn
about.

libnetfs is the library for complete file systems where the translator
does not directly control the underlying data, as is the case in
ftpfs, nfs and unionfs [14], for example. libnetfs
will probably be renamed to libfsserver.

libdiskfs is also for complete filesystems, but it is used in the case
where the translator controls the underlying data. Examples are
ext2fs, UFS and tmpfs.

libtreefs is defunct. It was never finished. Nobody uses it. Neither
should you.


libports provides functions for working with ports. It can also be
seen as an abstraction of what functionality the Hurd expects from a
message-passing system.

libstore: the following explanation can be found in the libstore
header file: “A `store' is a fixed-size block of storage, which can
be read and perhaps written to. This library implements many different
backends which allow the abstract store interface to be used with
common types of storage — devices, files, memory, tasks, etc. It also
allows stores to be combined and filtered in various ways.”


libiohelp: “Library providing helper functions for io servers.”

libthreads is the cthreads library. This library comes from the Mach
microkernel and was developed before the POSIX threads standard
existed. We will have POSIX threads in the future, but currently this
library is used for multithreading.



libihash provides integer-keyed hash table functions.




libps: “Routines to gather and print process information.”

libshouldbeinlibc: Nomen est omen. :-)




8 An Example using trivfs








8.1 GNU/Linux and GNU/Hurd



Before we take a closer look at how to use trivfs, let's see how this
would be done on a GNU/Linux system. I won't explain the GNU/Linux
example in much detail, because it is not very important for us, but
since lots of people are familiar with Linux (kernel) coding, it might
be helpful to compare how things are done on GNU/Linux as opposed to
GNU/Hurd.

Writing a Linux kernel module for a device file like /dev/one (which,
of course, gives you infinite ones if you read from it) is easy in
theory. A module for kernel 2.4.x providing a special file might look
like this:

     /* linux-one.c - Linux kernel module for /dev/one.
      *
      * Copyright (C) 2000, 2001, 2007 Free Software Foundation, Inc.
      *
      * Written by Wolfgang Jährling <wolfgang@pro-linux.de>.
      *
      * Distributed under the terms of the GNU General Public License.
      * This is distributed "as is". No warranty is provided at all.
      */

     #include <linux/kernel.h>
     #include <linux/module.h>
     #include <linux/fs.h>
     #include <linux/wrapper.h>
     #include <asm/uaccess.h>

     #define ONE_NAME "one"
     #define ONE_MAJOR 100 /* Major device file number */

     static int is_opened = 0;

     /* Someone wants to open the file */
     static int
     device_open (struct inode *inode, struct file *file)
     {
       /* We allow only one simultaneous usage */
       if (is_opened)
         return -EBUSY;

       is_opened = 1;
       MOD_INC_USE_COUNT; /* Module can't be unloaded now */

       return 0; /* Could be opened */
     }

     /* The file is closed again */
     static int
     device_release (struct inode *inode, struct file *file)
     {
       is_opened = 0;
       MOD_DEC_USE_COUNT; /* Module may be unloaded now */

       return 0;
     }

     /* Somebody wants to have lots of one's */
     static ssize_t
     device_read (struct file *file, char *buf, size_t len, loff_t *offset)
     {
       int i;
       static char one = 1;

       for (i = 0; i < len; i++)
         if (copy_to_user (&buf[i], &one, 1))
           return -EFAULT;

       return len;
     }

     /* Now he/she wants to write something... */
     static ssize_t
     device_write (struct file *file, const char *buf, size_t len,
                   loff_t *offset)
     {
       /* ...but we don't care */
       return len;
     }

     /* Let's put the supported operations in a structure */
     struct file_operations one_operations =
     {
       NULL, /* Owner module... wonder what this means :-) */
       NULL, /* seek */
       device_read,
       device_write,
       NULL, /* readdir */
       NULL, /* poll */
       NULL, /* ioctl */
       NULL, /* mmap */
       device_open,
       NULL, /* flush */
       device_release,
       NULL, NULL, NULL, NULL, NULL /* Some others */
     };

     /* This is automatically called when the module is loaded */
     int
     init_module (void)
     {
       int result = register_chrdev (ONE_MAJOR, ONE_NAME, &one_operations);

       if (result < 0) /* Could not register character device */
         {
           printk (KERN_ERR "Couldn't register device: %d.\n", result);
           return result;
         }
       printk (KERN_INFO "Loading the %s module.\n", ONE_NAME);
       return 0;
     }

     /* This gets called when unloading the module */
     void
     cleanup_module (void)
     {
       int result = unregister_chrdev (ONE_MAJOR, ONE_NAME);

       if (result < 0)
         printk (KERN_ERR "Couldn't unregister device: %d.\n", result);
       else
         printk (KERN_INFO "Unloading the %s module.\n", ONE_NAME);
     }

We simply implement the usual operations like read (),
close () and open (), put pointers to these functions in
a struct and register this as a character device.

In the next step, we would compile this into an object file and load
it as root with insmod(8) and create the device file with

     # mknod /dev/one c 100 1


This is simple to understand — but hard to put into practice, for (at
least) four reasons: first, a small mistake in the code might cause a
kernel panic; second, you need root privileges to do this at all;
third, you can't use functions provided by the GNU C library, let
alone other helpful libraries like the GLib; fourth, you need an
unused device number (I used 100 above and hoped that nobody else used
that before).

With the Hurd, things work differently, though. Of course, the
superstructure looks quite different, but also (and more importantly)
the environment of the code is much more friendly: it's the `normal'
user space we all know and love. This effectively means that you can
develop a translator almost like any other program. The only
disadvantage is that the programming interface is a bit more complex
than the one of the Linux kernel. But if you understood the GNU/Linux
example above, you won't have problems with the following translator,
which implements the same functionality.

In fact, it provides much more functionality, because libtrivfs forces
us to implement more; when writing a real world translator, you should
also do option parsing, because sometimes users only ask for
`--help', but I tried to keep this example translator as simple as
possible. When writing programs for the GNU system, we recommend
parsing options with the argp functions [15].




8.2 Implementing trivfs callback functions



In the GNU/Hurd system, the usual Unix system calls are provided by
the GNU C Library. The GNU C Library wraps them into messages and sends
them to the respective ports. This means that you won't write direct
implementations of functions like read (); instead, you need
functions with the arguments of, say, io_read (). When using
the trivfs library, we have to implement routines with slightly
different arguments. For example, the arguments of
trivfs_S_io_read (), as the name of such a function would be
when using libtrivfs, are:

Type                      Name        Description

struct trivfs_protid *    cred        Credentials

mach_port_t               reply       The port where the reply will be sent

mach_msg_type_name_t      reply_type  The rights we have on the above port

vm_address_t *            data        Pointer to the place where you should
                                      write your reply data

mach_msg_type_number_t *  data_len    Here you should store how much data you
                                      actually return. Initially, this is set
                                      to the size of the already available
                                      memory at *data.

off_t                     offs        Position to seek to. If offs is -1, use
                                      the internal file pointer. Ignore it if
                                      the object is not seekable.

mach_msg_type_number_t    amount      How much data you should write



The trivfs_S_io_read () function of the “Hello, world”
translator (see $(HURD)/trans/hello.c) is a nice example of
how to implement such a function. The implementation of our “one”
translator will be a less complete example.

This is what our function looks like:

     error_t
     trivfs_S_io_read (struct trivfs_protid *cred,
                       mach_port_t reply, mach_msg_type_name_t reply_type,
                       vm_address_t *data, mach_msg_type_number_t *data_len,
                       off_t offs, mach_msg_type_number_t amount)
     {
       /* Deny access if they have bad credentials. */
       if (!cred)
         return EOPNOTSUPP;
       else if (!(cred->po->openmodes & O_READ))
         return EBADF;

       if (amount > 0)
         {
           mach_msg_type_number_t i;

           /* Possibly allocate a new buffer. */
           if (*data_len < amount)
             {
               void *buf = mmap (0, amount, PROT_READ|PROT_WRITE,
                                 MAP_ANON, 0, 0);
               if (buf == MAP_FAILED)
                 return ENOMEM;
               *data = (vm_address_t) buf;
             }

           /* Copy the constant data into the buffer. */
           for (i = 0; i < amount; i++)
             ((char *) *data)[i] = 1;
         }

       *data_len = amount;
       return 0;
     }

This is the most complex callback function of our translator. The
others are much simpler.

You should always return EOPNOTSUPP (Operation not supported) if
cred is faulty and EBADF (Bad file descriptor) if the necessary bit
is not set in the open mode.

If the user wants to read more bytes (the number in amount) than the
buffer can hold, we have to allocate some more memory. Do you remember
that we had to pass a pointer to the pointer to our buffer when
calling io_read ()? This was done for exactly this reason. We
allocate the memory with mmap () here because we want a
page-aligned memory block. Now we have made sure that there is enough
space to write the data into, so we can begin filling in the ones.
Note that *data is a vm_address_t and we have to cast it to a pointer
before we can use it as such.
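To look at the buffer logic in isolation, here is a minimal sketch in portable C. The helper name fill_ones () is made up for this illustration and is not part of libtrivfs. It also assumes the mmap () conventions of other POSIX systems (MAP_PRIVATE and a file descriptor of -1), which differ slightly from the call used in the translator itself.

```c
#define _GNU_SOURCE 1

#include <stddef.h>
#include <errno.h>
#include <sys/mman.h>

/* Hypothetical helper, not part of libtrivfs: make sure *BUF can hold
   AMOUNT bytes, then fill it with ones.  *LEN is the capacity of *BUF
   on entry and the number of bytes produced on return.  */
int
fill_ones (void **buf, size_t *len, size_t amount)
{
  size_t i;

  if (amount > 0 && *len < amount)
    {
      /* The caller's buffer is too small: hand back fresh,
         page-aligned anonymous memory instead.  */
      void *p = mmap (NULL, amount, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANON, -1, 0);
      if (p == MAP_FAILED)
        return ENOMEM;
      *buf = p;
    }

  /* Copy the constant data into the buffer.  */
  for (i = 0; i < amount; i++)
    ((char *) *buf)[i] = 1;

  *len = amount;
  return 0;
}
```

A caller passes the address of its buffer pointer; if the pointer has changed after the call, the data lives in newly mapped memory and the original buffer was left untouched.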

If you understand the above function, I doubt you will have trouble
with the following write routine, which does almost nothing:

     kern_return_t
     trivfs_S_io_write (struct trivfs_protid *cred,
                        mach_port_t reply, mach_msg_type_name_t replytype,
                        vm_address_t data, mach_msg_type_number_t datalen,
                        off_t offs, mach_msg_type_number_t *amount)
     {
       if (!cred)
         return EOPNOTSUPP;
       else if (!(cred->po->openmodes & O_WRITE))
         return EBADF;
       *amount = datalen;
       return 0;
     }

Apart from the usual error checking it only claims that all data the
user of our translator (i.e. the program which opened the file our
translator implements) wanted to write was successfully written —
which makes perfect sense, because we ignore all data someone writes
into our file.

There are several callbacks we will implement in much the same way as
the write function. You can find these functions in the complete
source code of our translator.

Another callback routine is trivfs_S_io_readable (). It will be
called if somebody wants to know how much data we can deliver
immediately. Of course we can provide an infinite number of bytes
directly. And as Marcus Brinkmann told me [16]:


“If you can deliver an unlimited number of bytes without blocking, I
think the highest possible value that fits in mach_msg_type_number_t
seems to be appropriate. (I hope applications can deal with that).”


Well, if something can go wrong, it will go wrong. For example, our
example program dump.c above would handle such a situation in a very
ungraceful way. This is why I wrote the following implementation,
which is a bit paranoid and does not cause problems if an application
is unable to handle a huge value in a sane way:

     kern_return_t
     trivfs_S_io_readable (struct trivfs_protid *cred,
                           mach_port_t reply, mach_msg_type_name_t replytype,
                           mach_msg_type_number_t *amount)
     {
       if (!cred)
         return EOPNOTSUPP;
       else if (!(cred->po->openmodes & O_READ))
         return EINVAL;
       else
         *amount = 10240; /* Dummy value: 10k */
       return 0;
     }

The last interesting callback function is trivfs_S_io_select (),
which is well commented in the complete source below.




8.3 Other trivfs callbacks



We have to do some more work before we have a complete trivfs
translator: we must define some symbols that hold general information
about our translator.

     /* Trivfs hooks.  */
     int trivfs_fstype = FSTYPE_MISC; /* Generic trivfs server */
     int trivfs_fsid = 0; /* Should always be 0 on startup */

In most cases, you might want to set trivfs_fstype to
FSTYPE_MISC. Other possible values are (descriptions from
$(HURD)/hurd/hurd_types.h):


  1. FSTYPE_IFSOCK: PF_LOCAL socket naming point
  2. FSTYPE_DEV: GNU Special file server
  3. FSTYPE_TERM: GNU Terminal driver


In trivfs_allow_open, you specify the initial permissions for
your translator:


     int trivfs_allow_open = O_READ | O_WRITE;


And with the following three variables, you specify what kinds of
accesses are actually implemented:

     /* Actual supported modes: */
     int trivfs_support_read = 1;
     int trivfs_support_write = 1;
     int trivfs_support_exec = 0;




8.4 The main function



Translators are normal programs, and as such, they need a main ()
function. The trivfs library does not define such a function, so
we have to do this on our own. Our program may be started as a normal
program or as a translator, so we have to distinguish between these
two cases. This can be done by testing if our bootstrap port is
MACH_PORT_NULL. If it is, the program was not started as a
translator. In most cases, a translator can simply abort in this
case. If, however, our bootstrap port is not MACH_PORT_NULL, we should
initialize libtrivfs and deallocate the bootstrap port. In the last
step, we will launch the translator. Our main () function
(which does not process any command line arguments) looks like this:

     int
     main (void)
     {
       error_t err;
       mach_port_t bootstrap;
       struct trivfs_control *fsys;

       task_get_bootstrap_port (mach_task_self (), &bootstrap);
       if (bootstrap == MACH_PORT_NULL)
         error (1, 0, "Must be started as a translator");

       /* Reply to our parent */
       err = trivfs_startup (bootstrap, 0, 0, 0, 0, 0, &fsys);
       mach_port_deallocate (mach_task_self (), bootstrap);
       if (err)
         error (1, err, "trivfs_startup failed");

       /* Launch. */
       ports_manage_port_operations_one_thread (fsys->pi.bucket,
                                                trivfs_demuxer, 0);

       return 0;
     }

You may have wondered why our functions have such disgusting names as
trivfs_S_io_read (). By now you should know the
answer: We don't need to register our functions anywhere; we only have
to give them the appropriate names and libtrivfs will do the rest for
us. Of course, the function names actually have a meaning, as Marcus
Brinkmann explains:


“The io_read is the name of the RPC as in the .defs file. The S_
prefix means it is the _S_erver stub implemented here, rather than the
message packaging/unpacking functions. The trivfs_ prefix means that
this is not the bare RPC, but the mach_port_t is actually converted to
a credential struct cred (or so). This is done at the INTRAN.

Usually, you get the mach_port_t as first argument, but in libtrivfs
stubs, you get a different. Grep for intran in
$(HURD)/libtrivfs/* and you will see how ports are mapped to
credentials.”
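Why a name alone is enough can be shown with a toy model. The sketch below is only an illustration under my own assumptions: the toy_ names and request IDs are invented, and the real demuxer generated from the .defs files is considerably more involved. The "library" half only declares the server stubs it expects and dispatches on a request ID; the "translator" half defines functions with exactly those names, and the linker connects the two.

```c
#include <stddef.h>

/* Toy request IDs; real RPC IDs come from the .defs files.  */
enum { TOY_IO_READ = 1, TOY_IO_WRITE = 2 };

/* "Library" side: it declares the stubs it expects to find...  */
int toy_S_io_read (int amount);
int toy_S_io_write (int amount);

/* ...and dispatches on the request ID.  Returns the stub's result,
   or -1 if the request is unknown.  */
int
toy_demuxer (int msgid, int amount)
{
  switch (msgid)
    {
    case TOY_IO_READ:
      return toy_S_io_read (amount);
    case TOY_IO_WRITE:
      return toy_S_io_write (amount);
    default:
      return -1;
    }
}

/* "Translator" side: defining functions with the expected names is
   all the registration there is.  */
int
toy_S_io_read (int amount)
{
  return amount;        /* Pretend we delivered AMOUNT bytes.  */
}

int
toy_S_io_write (int amount)
{
  (void) amount;        /* Ignore whatever was written.  */
  return 0;
}
```

If one of the expected functions were missing, the program would fail to link, which is also what happens when a trivfs translator leaves out a required callback.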






8.5 The complete source



Okay, that's all, folks. I left out some unimportant details, which
you may look up in the following complete listing of hurd-one.c. You
can compile this file with

     $ gcc -g -o one hurd-one.c -ltrivfs -lfshelp


Other sources you might want to look at are
$(HURD)/trans/hello.c and $(HURD)/trans/null.c.

     /* hurd-one.c - A trivial single-file translator

        Copyright (C) 1995, 1996, 1997, 1998, 1999, 2001, 2002, 2007
        Free Software Foundation, Inc.

        Written by Wolfgang Jährling <wolfgang@pro-linux.de>, 2001

        This is based on hurd/trans/hello.c. The hello.c source says:
          Copyright (C) 1998, 1999, 2001 Free Software Foundation, Inc.
          Gordon Matzigkeit <gord@fig.org>, 1999
        It also uses parts of hurd/trans/null.c. The null.c source says:
          Copyright (C) 1995,96,97,98,99,2001 Free Software Foundation, Inc.
          Written by Miles Bader <miles@gnu.org>

        This program is free software; you can redistribute it and/or
        modify it under the terms of the GNU General Public License as
        published by the Free Software Foundation; either version 2, or (at
        your option) any later version.

        This program is distributed in the hope that it will be useful, but
        WITHOUT ANY WARRANTY; without even the implied warranty of
        MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
        General Public License for more details.

        You should have received a copy of the GNU General Public License
        along with this program; if not, write to the Free Software
        Foundation, Inc., 59 Temple Place, Suite 330,
        Boston, MA 02111-1307 USA */

     #define _GNU_SOURCE 1

     #include <hurd/trivfs.h>

     #include <stdlib.h> /* exit () */
     #include <error.h> /* error () */
     #include <fcntl.h> /* O_READ etc. */
     #include <sys/mman.h> /* MAP_ANON etc. */

     /* Trivfs hooks. */
     int trivfs_fstype = FSTYPE_MISC; /* Generic trivfs server */
     int trivfs_fsid = 0; /* Should always be 0 on startup */

     int trivfs_allow_open = O_READ | O_WRITE;

     /* Actual supported modes: */
     int trivfs_support_read = 1;
     int trivfs_support_write = 1;
     int trivfs_support_exec = 0;

     /* May do nothing... */
     void
     trivfs_modify_stat (struct trivfs_protid *cred, struct stat *st)
     {
       /* .. and we do nothing */
     }

     error_t
     trivfs_goaway (struct trivfs_control *cntl, int flags)
     {
       exit (EXIT_SUCCESS);
     }

     error_t
     trivfs_S_io_read (struct trivfs_protid *cred,
                       mach_port_t reply, mach_msg_type_name_t reply_type,
                       vm_address_t *data, mach_msg_type_number_t *data_len,
                       off_t offs, mach_msg_type_number_t amount)
     {
       /* Deny access if they have bad credentials. */
       if (!cred)
         return EOPNOTSUPP;
       else if (!(cred->po->openmodes & O_READ))
         return EBADF;

       if (amount > 0)
         {
           mach_msg_type_number_t i;

           /* Possibly allocate a new buffer. */
           if (*data_len < amount)
             {
               void *buf = mmap (0, amount, PROT_READ|PROT_WRITE,
                                 MAP_ANON, 0, 0);
               if (buf == MAP_FAILED)
                 return ENOMEM;
               *data = (vm_address_t) buf;
             }

           /* Copy the constant data into the buffer. */
           for (i = 0; i < amount; i++)
             ((char *) *data)[i] = 1;
         }

       *data_len = amount;
       return 0;
     }

     kern_return_t
     trivfs_S_io_write (struct trivfs_protid *cred,
                        mach_port_t reply, mach_msg_type_name_t replytype,
                        vm_address_t data, mach_msg_type_number_t datalen,
                        off_t offs, mach_msg_type_number_t *amount)
     {
       if (!cred)
         return EOPNOTSUPP;
       else if (!(cred->po->openmodes & O_WRITE))
         return EBADF;
       *amount = datalen;
       return 0;
     }

     /* Tell how much data can be read from the object without blocking for
        a "long time" (this should be the same meaning of "long time" used
        by the nonblocking flag). */
     kern_return_t
     trivfs_S_io_readable (struct trivfs_protid *cred,
                           mach_port_t reply, mach_msg_type_name_t replytype,
                           mach_msg_type_number_t *amount)
     {
       if (!cred)
         return EOPNOTSUPP;
       else if (!(cred->po->openmodes & O_READ))
         return EINVAL;
       else
         *amount = 10240; /* Dummy value: 10k */
       return 0;
     }

     /* Truncate file. */
     kern_return_t
     trivfs_S_file_set_size (struct trivfs_protid *cred, off_t size)
     {
       if (!cred)
         return EOPNOTSUPP;
       else
         return 0;
     }

     /* Change current read/write offset */
     error_t
     trivfs_S_io_seek (struct trivfs_protid *cred, mach_port_t reply,
                       mach_msg_type_name_t reply_type, off_t offs, int whence,
                       off_t *new_offs)
     {
       if (!cred)
         return EOPNOTSUPP;
       else
         return 0;
     }

     /* SELECT_TYPE is the bitwise OR of SELECT_READ, SELECT_WRITE, and
        SELECT_URG. Block until one of the indicated types of i/o can be
        done "quickly", and return the types that are then available.
        TAG is returned as passed; it is just for the convenience of the
        user in matching up reply messages with specific requests sent. */
     kern_return_t
     trivfs_S_io_select (struct trivfs_protid *cred,
                         mach_port_t reply, mach_msg_type_name_t replytype,
                         int *type, int *tag)
     {
       if (!cred)
         return EOPNOTSUPP;
       else if (((*type & SELECT_READ) && !(cred->po->openmodes & O_READ))
                || ((*type & SELECT_WRITE) && !(cred->po->openmodes & O_WRITE)))
         return EBADF;
       else
         *type &= ~SELECT_URG;
       return 0;
     }

     /* Well, we have to define these four functions, so here we go: */

     kern_return_t
     trivfs_S_io_get_openmodes (struct trivfs_protid *cred, mach_port_t reply,
                                mach_msg_type_name_t replytype, int *bits)
     {
       if (!cred)
         return EOPNOTSUPP;
       else
         {
           *bits = cred->po->openmodes;
           return 0;
         }
     }

     error_t
     trivfs_S_io_set_all_openmodes (struct trivfs_protid *cred,
                                    mach_port_t reply,
                                    mach_msg_type_name_t replytype,
                                    int mode)
     {
       if (!cred)
         return EOPNOTSUPP;
       else
         return 0;
     }

     kern_return_t
     trivfs_S_io_set_some_openmodes (struct trivfs_protid *cred,
                                     mach_port_t reply,
                                     mach_msg_type_name_t replytype,
                                     int bits)
     {
       if (!cred)
         return EOPNOTSUPP;
       else
         return 0;
     }

     kern_return_t
     trivfs_S_io_clear_some_openmodes (struct trivfs_protid *cred,
                                       mach_port_t reply,
                                       mach_msg_type_name_t replytype,
                                       int bits)
     {
       if (!cred)
         return EOPNOTSUPP;
       else
         return 0;
     }

     int
     main (void)
     {
       error_t err;
       mach_port_t bootstrap;
       struct trivfs_control *fsys;

       task_get_bootstrap_port (mach_task_self (), &bootstrap);
       if (bootstrap == MACH_PORT_NULL)
         error (1, 0, "Must be started as a translator");

       /* Reply to our parent */
       err = trivfs_startup (bootstrap, 0, 0, 0, 0, 0, &fsys);
       mach_port_deallocate (mach_task_self (), bootstrap);
       if (err)
         error (1, err, "trivfs_startup failed");

       /* Launch. */
       ports_manage_port_operations_one_thread (fsys->pi.bucket,
                                                trivfs_demuxer, 0);

       return 0;
     }




9 Debugging a translator



This chapter requires you to know how to use GDB, the GNU Debugger. If
you have not used GDB before, I recommend reading the sample session
chapter in the GDB Texinfo documentation. If info and the
documentation are installed on your system, simply do

     $ info gdb "Sample Session"


Ok, so now how does one debug a translator? It's pretty obvious, but I
will explain it anyway. :-) The easiest way is to start your program
as an active translator:

     $ gcc -g -o one one.c -ltrivfs -lfshelp
     $ settrans -ac foo one

Now the translator is up and running. You can see it in the process
list:

     $ ps Aux


(We don't have POSIX `ps' at the moment, so `ps aux' won't work,
sorry.) Now we need to attach to the running (or rather, waiting)
process. For example, if the PID was 357, we would do:

     $ gdb one 357


At the gdb prompt, we can now set breakpoints, then let the translator
continue:

     (gdb) break trivfs_S_io_read
     (gdb) c

Now, you should switch to another screen window, xterm or
similar. Enter a command like

     $ cat foo


there and switch back to the terminal where you are running GDB. You
will see that it stopped at the breakpoint. Now you can debug as
usual. That's easy, isn't it?

If you are done, enter

     (gdb) quit


and say that you want to detach the process. We can conclude by saying that
you don't need any special technique for debugging a translator — at least
such a simple one.

At this point, you probably have an idea about how to develop Hurd
servers (translators). Often, you will need more than libtrivfs
provides. For outdated information on other libraries, see the Hurd
Reference Manual (`info hurd'), also available in
$(HURD)/doc/hurd.texi; for up-to-date information, read the
appropriate header files.





10 Comprehensive trivfs example



TODO: Maybe a `cat' translator?




11 An example using netfs



TODO




12 An example using diskfs



TODO









13 Frequently Asked Questions



Q: How can a translator access its underlying node? A gzip translator
must do this, for example.

A: The underlying node is returned by fsys_startup(). See
$(HURD)/hurd/fsys.defs.

Q: Can one stack translators?

A: Yes, stacking active translators is possible, but you can't do it
with passive translators.

Q: What is a `protid'?

A: `prot' means protection. Every protid structure denotes a unique
user (i.e. client) of our translator. You can access per-open
information via the `po' field of the structure.

Q: Which kind of threads should I use if I want to write a program
that will run on both GNU/Linux and GNU/Hurd?

A: Use pthreads. We will have pthreads eventually.

Q: How can we claim to be POSIX compliant without having pthreads?

A: First of all, pthreads are optional. Second, we do have pthreads:
for example, GNU Portable Threads (pth) provides a non-preemptive
pthreads emulation. This seems to be standard compliant, but of course
it's not that useful, as most programs assume preemptive
multi-threading. Jeroen Dekkers is working on “real” pthreads, which
partially work, as of now.

Q: In the GNU Manifesto, RMS wrote that both C and LISP will be system
languages. What about that?

A: The Scheme interpreter Guile is part of the GNU project. As of now,
it provides only POSIX functionality, but there's no reason why
somebody shouldn't add GNU-specific stuff. Adding support for GNU
functionality to various languages would be nice indeed. That's
certainly not an urgent issue, however.

Q: Why should I learn about Mach if the Hurd switches to L4 soon?

A: As of now, the Hurd uses Mach and you need to know Mach basics to
do Hurd work. Maybe the Hurd will run on L4 in the future, but
currently it's very, very far away from doing so.









14 Appendices













14.1 Stuff to do



This document is a work in progress. There are several things that
should be added. If you want to help, please contact me.

     - Use consistent formatting. Often, @code{} should be used but isn't
     - Correct the remaining FIXMEs (ok, this one was obvious)
     - Take OSKit-Mach into account
     - Add Moritz' Mach device access example and link to mailing list
       archive with Daniel Wagner's Mach device code
     - Write about netfs and diskfs, add a longer trivfs example
     - Describe more Mach and esp. MIG details





Footnotes

[1] http://www.gnu.org/software/hurd/gnumach-doc/mach.html



[2] http://lists.debian.org/debian-hurd-0012/msg00149.html



[3] http://hurd.gnu.org/



[4] http://www.gnu.org/software/hurd/hurd-paper.html



[5] http://www.debian.org/ports/hurd/hurd-doc-translator



[6] http://www.gnu.org/software/hurd/devel.html#machinery



[7] REFERENCE missing



[8] REFERENCE missing



[9] http://lists.debian.org/debian-hurd/2002/05/msg00493.html



[10] see http://lists.gnu.org/archive/html/help-hurd/2001-07/msg00018.html
for the complete discussion



[11] Or better, use the already existing `run' translator, which was
written by Marcus Brinkmann and is much more flexible; using the
existing filemux translator would also be possible



[12] http://www.gnu.org/software/hurd/hurd-talk.html



[13] http://www.gnu.org/software/hurd/gnumach-doc/Inter-Process-Communication.html



[14] unionfs is not yet part of the Hurd,
but a partly working implementation exists.



[15] see "info libc Argp"



[16] for the complete mail see
http://lists.gnu.org/archive/html/help-hurd/2001-08/msg00003.html






By Akram on 4:00:00 PM


Web Services Hacking


There are many ways to attack Web Services. This tutorial outlines some of the basic ways that Web services hacking can damage an organization's data, applications and ability to function. Below is just a partial list of Web Services attacks that are possible against XML Web Services.



There are many ways to classify these attacks:




  • XML-Based Attacks: Taking advantage of the way XML works. For instance, an XML document can be sent that causes a large entity expansion, tying up system resources.

  • Bugs in Back End Systems: Many technologies are used in the XML message stream and can include XML parsers, application servers, operating systems, databases, etc. XML can encapsulate malware that can take advantage of bugs in these systems.

  • Code Injection Attacks: Attack code can be sent via a SOAP message to be later executed in a receiving application. For instance, SQL injection or cross-site scripting attacks are relatively easy to create.

  • Content-Based Attacks: Viruses, overly long strings, large messages, malformed messages are examples of attacks that can cause unexpected behavior at the receiving application.

  • Denial of Service: A flood of messages, or a message with hundreds of encrypted elements may cause systems resources to be tied up and service levels to be affected.

  • Man in the Middle Attack: Messages can be intercepted to cause routing problems or integrity problems. This can cause a receiving application to be disrupted.



SQL Injection/XPATH/XQUERY Attacks


Code injection attacks are relatively straightforward and usually require some knowledge of what the back end system is behind the interface. Many Web Services provide query-able information and have a SQL database in the backend. A Web Service can be quite easily compromised by sending code fragments within the envelope of Web Services. When the code fragment is unwrapped and sent to the database, special characters may cause unintended SQL, XPATH and XQUERY statements to be executed. This can allow access to systems without authorization, or access to information that was not intended to be seen. More malicious forms of injection attacks can cause unwanted commands or code to be run, for example to delete an entire database table.



Web Services Hacking I: A password table is compromised by simply resolving the authentication string to always be TRUE.





This situation enables simple authentication to the system. Other SQL injection statements can give unauthorized access to information or simply delete the entire table.
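As a rough illustration of how an authentication check can resolve to TRUE, the sketch below builds a query by pasting raw input between quotes. Everything in it is an assumption made for the example: the helper name build_login_query, the users table and its name and password columns are hypothetical, and real services differ in detail.

```c
#include <stdio.h>

/* Hypothetical, deliberately vulnerable helper: builds the SQL text
   by pasting the raw inputs between single quotes.  Returns the
   length snprintf would have written.  */
int
build_login_query (char *out, size_t size,
                   const char *user, const char *pass)
{
  return snprintf (out, size,
                   "SELECT * FROM users WHERE name = '%s' "
                   "AND password = '%s'",
                   user, pass);
}
```

With the password field set to `' OR '1'='1`, the resulting text ends in `password = '' OR '1'='1'`, so the WHERE clause no longer depends on the real password. Passing user input as bound parameters instead of splicing it into the statement text avoids this class of problem.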



Weak Password Attack


Enforcing strong password policies is common in many organizations and is often a regulatory requirement. Regardless of policy, it is also common for administrators to pick weak passwords. This can allow access to systems through trial and error or brute-force dictionary password attacks.



Web Services Hacking II: Weak password enforcement policies can result in weak passwords being chosen providing attackers an easier way to access systems.





WSDL Enumeration


Web Services is a self-describing set of standards which allows access to significant amounts of meta information to aid seamless communication. This also means that there is a lot of information available to attackers of Web Service systems. In this example, the WSDL file contains significant information as to where a particular service is, what types of functions are callable within the Web Service and how to interact with such a service. The WSDL is essentially an advertising mechanism that can reveal information such as a sensitive service or an important parameter. WSDL may also reveal what tools generated the Web Service providing attackers with more information on the environment.


Web Services Hacking III: The WSDL reveals several callable operations, most notably GetQuote and TradeStock.





In this situation, you may wish everybody to have access to GetQuote but only a certain subset of authorized requestors to have access to TradeStock. Even with authentication and access control, the WSDL may reveal more information about TradeStock than is desirable.



Routing Detours


Routing detours are a form of "Man in the Middle" attack which compromises routing information. Intermediaries can be "hijacked" to route sensitive messages to an outside location. Routing information (whether in the HTTP headers or in WS-Routing headers) can be modified en route. Traces of the routing can be removed from the message so that the receiving application does not realize that a routing detour has occurred.


Web Services Hacking IV: An intermediary is compromised which modifies WS-Routing headers to send sensitive information to an outside server. The information is either routed back to the intermediary or to the Web Service with all traces removed.





Malicious Morphing


Malicious morphing is another form of "Man in the Middle" attack. Data and security information can be modified en route by an attacker, resulting in data integrity issues and operational problems.


Web Services Hacking V: A compromised intermediary may modify the destination address of a purchase order or modify funds balance of a transaction to affect the data integrity of a back end system.



Cross-Site Scripting


SOAP and XML are standards used to wrap data for easy consumption. SOAP provides enveloping information to deliver messages in a seamless fashion between heterogeneous applications. XML includes metadata to describe the structure of the information. Malicious code can be embedded into the elements or CDATA of the information. CDATA is used to delineate information in the message that should not be parsed. Embedded characters or malicious code can be sent. The receiving application may display or execute the data in unintended ways. Cross-site scripting (sometimes called XML encapsulation) can be used to embed commands that can tie up system resources or gain unauthorized access.


Web Services Hacking VI: Illegal JavaScript code is injected into a message using CDATA. The field value, which is eventually displayed in a browser, actually runs JavaScript code in the browser, causing an infinite loop.
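One common defence is to escape the characters that are special in HTML before a received field value is displayed. The following is a minimal sketch of that idea, not a complete sanitizer; the helper name html_escape and its interface are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy SRC to DST, replacing the characters that
   are special in HTML with entity references.  Returns 0 on success
   and -1 if DST (of SIZE bytes) is too small.  */
int
html_escape (char *dst, size_t size, const char *src)
{
  size_t used = 0;

  if (size == 0)
    return -1;

  for (; *src; src++)
    {
      char one[2] = { *src, '\0' };
      const char *rep;
      size_t n;

      switch (*src)
        {
        case '<': rep = "&lt;"; break;
        case '>': rep = "&gt;"; break;
        case '&': rep = "&amp;"; break;
        case '"': rep = "&quot;"; break;
        default:  rep = one; break;
        }

      /* Keep one byte free for the terminating NUL.  */
      n = strlen (rep);
      if (used + n + 1 > size)
        return -1;
      memcpy (dst + used, rep, n);
      used += n;
    }

  dst[used] = '\0';
  return 0;
}
```

After escaping, a value such as `<script>while(1);</script>` is shown to the user as text instead of being run by the browser.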





XML-based Attacks


Sometimes called "Coercive Parsing", XML-based attacks take advantage of the XML parsers that process the SOAP message. Web Services and existing infrastructure do not provide protection against XML-based attacks. Recursive relationships that create entity expansions, bogus parameters and significant amounts of whitespace can cause XML parsers to be overloaded or to behave unexpectedly. A recent Oracle Application Server bug, for instance, allowed DTD references in a SOAP message, which the standard does not allow. This would enable circular DTD references to be made, causing resources to be tied up.


Web Services Hacking VII: Malicious content can be sent taking advantage of deficiencies in XML parsers.
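To get a feeling for why nested entity expansion ties up resources, the sketch below simply does the arithmetic. It assumes a hypothetical document in which a small base entity is wrapped several times and each wrapper references the level below it a fixed number of times; the function name expanded_size is made up for this illustration.

```c
#include <limits.h>

/* Bytes produced by fully expanding the top entity when a BASE-byte
   entity is wrapped LEVELS times and each wrapper references the
   level below it FANOUT times.  Saturates at ULLONG_MAX.  */
unsigned long long
expanded_size (unsigned long long base, unsigned fanout, unsigned levels)
{
  unsigned long long size = base;
  unsigned i;

  for (i = 0; i < levels; i++)
    {
      /* Stop before the multiplication would overflow.  */
      if (fanout != 0 && size > ULLONG_MAX / fanout)
        return ULLONG_MAX;
      size *= fanout;
    }
  return size;
}
```

For a 3-byte entity, 10 references per level and 9 levels, the result is 3,000,000,000 bytes, although the message itself is well under a kilobyte. A parser that limits expansion depth or total expanded size is not affected by this growth.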



Discovering and Eliminating Threats


In cases where malicious content is propagated on the services network, a Web services management solution – such as Actional provides – is ideal for discovering and eliminating such "rogue" services.







Instant Hacking

By Akram on 3:53:00 PM


[If you like this tutorial, please check out my book Beginning Python, or perhaps become my fan on Facebook?]

This is a short introduction to the art of programming, with examples written in the programming language Python. (If you already know how to program, but want a short intro to Python, you may want to check out my article Instant Python.) This article has been translated into Italian, Polish, Japanese, Serbian, Brazilian Portuguese, and Dutch, and is in the process of being translated into Korean.

This page is not about breaking into other people’s computer systems etc. I’m not into that sort of thing, so please don’t email me about it.

Note: To get the examples working properly, write the programs in a text file and then run that with the interpreter; do not try to run them directly in the interactive interpreter, since not all of them will work there. (Please don’t ask me for details on this. Check the documentation or send an email to help@python.org.)

The Environment

To program in Python, you must have an interpreter installed. It exists for most platforms (including Macintosh, Unix and Windows). More information about this can be found on the Python web site. You also should have a text editor (like emacs, notepad or something similar).

What is Programming?

Programming a computer means giving it a set of instructions telling it what to do. A computer program in many ways resembles a recipe, like the ones we use for cooking. For example [1]:

Fiesta SPAM Salad

Ingredients:

Marinade:
1/4 cup lime juice
1/4 cup low-sodium soy sauce
1/4 cup water
1 tablespoon vegetable oil
3/4 teaspoon cumin
1/2 teaspoon oregano
1/4 teaspoon hot pepper sauce
2 cloves garlic, minced

Salad:
1 (12-ounce) can SPAM Less Sodium luncheon meat,
cut into strips
1 onion, sliced
1 bell pepper, cut in strips
Lettuce
12 cherry tomatoes, halved

Instructions:

In jar with tight-fitting lid, combine all marinade ingredients;
shake well. Place SPAM strips in plastic bag. Pour marinade
over SPAM. Seal bag; marinate 30 minutes in refrigerator.
Remove SPAM from bag; reserve 2 tablespoons marinade. Heat
reserved marinade in large skillet. Add SPAM, onion, and
green pepper. Cook 3 to 4 minutes or until SPAM is heated.
Line 4 individual salad plates with lettuce. Spoon hot salad
mixture over lettuce. Garnish with tomato halves. Serves 4.

Of course, no computer would understand this… And most computers wouldn’t be able to make a salad even if they did understand the recipe. So what do we have to do to make this more computer-friendly? Well — basically two things. We have to (1) talk in a way that the computer can understand, and (2) talk about things that it can do something with.

The first point means that we have to use a language (a programming language that we have an interpreter program for), and the second point means that we can’t expect the computer to make a salad, but we can expect it to add numbers, write things to the screen etc.

Hello…

There’s a tradition in programming tutorials to always begin with a program that prints “Hello, world!” to the screen. In Python, this is quite simple:

     print "Hello, world!"

This is basically like the recipe above (although it is much shorter!). It tells the computer what to do: To print “Hello, world!”. Piece of cake. What if we wanted it to do more stuff?

     print "Hello, world!"
     print "Goodbye, world!"

Not much harder, was it? And not really very interesting… We want to be able to do something with the ingredients, just like in the spam salad. Well, what ingredients do we have? For one thing, we have strings of text, like “Hello, world!”, but we also have numbers. Say we wanted the computer to calculate the area of a rectangle for us. Then we could give it the following little recipe:

     # The Area of a Rectangle

     # Ingredients:

     width = 20
     height = 30

     # Instructions:

     area = width*height
     print area

You can probably see the similarity (albeit slight) to the spam salad recipe. But how does it work? First of all, the lines beginning with # are called comments and are actually ignored by the computer. However, inserting small explanations like this can be important in making your programs more readable to humans.

Now, the lines that look like foo = bar are called assignments. In the case of width = 20 we tell the computer that the width should be 20 from this point on. What does it mean that “the width is 20”? It means that a variable by the name “width” is created (or if it already exists, it is reused) and given the value 20. So, when we use the variable later, the computer knows its value. Thus,

     width*height

is essentially the same as

     20*30

which is calculated to be 600, which is then assigned to the variable by the name “area”. The final statement of the program prints out the value of the variable “area”, so what you see when you run this program is simply

     600

Note: In some languages you have to tell the computer which variables you need at the beginning of the program (like the ingredients of the salad); Python is smart enough to figure this out as it goes along.

Feedback

OK. Now you can perform simple, and even quite advanced calculations. For instance, you might want to make a program to calculate the area of a circle instead of a rectangle:
radius = 30

print radius*radius*3.14

However, this is not significantly more interesting than the rectangle program. At least not in my opinion. It is somewhat inflexible. What if the circle we were looking at had a radius of 31? How would the computer know? It’s a bit like the part of the salad recipe that says: “Cook 3 to 4 minutes or until SPAM is heated.” To know when it is cooked, we have to check. We need feedback, or input. How does the computer know the radius of our circle? It too needs input… What we can do is to tell it to check the radius:
radius = input("What is the radius?")

print radius*radius*3.14

Now things are getting snazzy… input is something called a function. (You’ll learn to create your own in a while. input is a function that is built into the Python language.) Simply writing
input

won’t do much… You have to put a pair of parentheses at the end of it. So input() would work; it would simply wait for the user to enter the radius. The version above is perhaps a bit more user-friendly, though, since it prints out a question first. When we put something like the question-string “What is the radius?” between the parentheses of a function call it is called passing a parameter to the function. The thing (or things) in the parentheses is (or are) the parameter(s). In this case we pass a question as a parameter so that input knows what to print out before getting the answer from the user.

But how does the answer get to the radius variable? The function input, when called, returns a value (like many other functions). You don’t have to use this value, but in our case, we want to. So, the following two statements have very different meanings:
foo = input

bar = input()

foo now contains the input function itself (so it can actually be used like foo(“What is your age?”); this is called a dynamic function call) while bar contains whatever is typed in by the user.
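
Here is a small sketch of the same difference that needs no typing from the user. It uses abs, another built-in function (it strips the sign from a number), in place of input; that swap is made here for illustration only, and print is written with parentheses so that the lines also run on newer Python versions:

```python
foo = abs        # no parentheses: foo is now another name for the function abs
bar = abs(-30)   # parentheses: abs is called, and bar receives the result

print(bar)       # 30
print(foo(-5))   # 5, since calling foo is the same as calling abs
```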
Flow

Now we can write programs that perform simple actions (arithmetic and printing) and that can receive input from the user. This is useful, but we are still limited to so-called sequential execution of the commands, that is — they have to be executed in a fixed order. Most of the spam salad recipe is sequential or linear like that. But what if we wanted to tell the computer how to check on the cooked spam? If it is heated, then it should be removed from the oven — otherwise, it should be cooked for another minute or so. How do we express that?

What we want to do, is to control the flow of the program. It can go in two directions — either take out the spam, or leave it in the oven. We can choose, and the condition is whether or not it is properly heated. This is called conditional execution. We can do it like this:
temperature = input("What is the temperature of the spam?")

if temperature > 50:
    print "The salad is properly cooked."
else:
    print "Cook the salad some more."

The meaning of this should be obvious: If the temperature is higher than 50 (degrees centigrade), then print out a message telling the user that it is properly cooked; otherwise, tell the user to cook the salad some more.

Note: The indentation is important in Python. Blocks in conditional execution (and loops and function definitions — see below) must be indented (and indented by the same amount of whitespace; a tab counts as 8 spaces) so that the interpreter can tell where they begin and end. It also makes the program more readable to humans.
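
To see the effect of indentation, here is a minimal sketch in which each block holds two statements. The variable names message, advice and checked are made up for this illustration, the temperature is fixed at 40 instead of being asked for, and print is written with parentheses so that it also runs on newer Python versions:

```python
temperature = 40

if temperature > 50:
    message = "The salad is properly cooked."
    advice = "Take it out of the oven."     # still inside the if-block
else:
    message = "Cook the salad some more."
    advice = "Check again in a minute."     # still inside the else-block

checked = True                              # not indented: runs in either case
print(message)
print(advice)
```

Because both indented lines after else belong to the same block, they are either both executed or both skipped.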

Let’s return to our area calculations. Can you see what this program does?
# Area calculation program

print "Welcome to the Area calculation program"
print "--------------------------"
print

# Print out the menu:
print "Please select a shape:"
print "1 Rectangle"
print "2 Circle"

# Get the user's choice:
shape = input("> ")

# Calculate the area:
if shape == 1:
    height = input("Please enter the height: ")
    width = input("Please enter the width: ")
    area = height*width
    print "The area is", area
else:
    radius = input("Please enter the radius: ")
    area = 3.14*(radius**2)
    print "The area is", area

New things in this example:
print used all by itself prints out an empty line
== checks whether two things are equal, as opposed to =, which assigns the value on the right side to the variable on the left. This is an important distinction!
** is Python’s power operator — thus the squared radius is written radius**2.
print can print out more than one thing. Just separate them with commas. (They will be separated by single spaces in the output.)
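
The difference between = and ==, and the use of **, can be tried out in a few lines. This is only a sketch; the names squared and same are invented for it, and print is written with parentheses so that it also runs on newer Python versions:

```python
radius = 3                # a single = assigns
squared = radius ** 2     # ** raises to a power: 3*3
same = (squared == 9)     # == compares and gives a truth value

print(squared)            # 9
print(same)               # True
```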

The program is quite simple: It asks for a number, which tells it whether the user wants to calculate the area of a rectangle or a circle. Then, it uses an if-statement (conditional execution) to decide which block it should use for the area calculation. These two blocks are essentially the same as those used in the previous area examples. Notice how the comments make the code more readable. It has been said that the first commandment of programming is: “Thou shalt comment!” Anyway — it’s a nice habit to acquire.
Exercise 1

Extend the program above to include area calculations on squares, where the user only has to enter the length of one side. There is one thing you need to know to do this: If you have more than two choices, you can write something like:
if foo == 1:
    # Do something...
elif foo == 2:
    # Do something else...
elif foo == 3:
    # Do something completely different...
else:
    # If all else fails...

Here elif is a mysterious code which means “else if” :). So, if foo is one, then do something; otherwise, if foo is two, then do something else, etc. You might want to add other options to the program too, like triangles or arbitrary polygons. It’s up to you.
Loops

Sequential execution and conditionals are only two of the three fundamental building blocks of programming. The third is the loop. In the previous section I proposed a solution for checking if the spam was heated, but it was quite clearly inadequate. What if the spam wasn’t finished the next time we checked either? How could we know how many times we needed to check it? The truth is, we couldn’t. And we shouldn’t have to. We should be able to ask the computer to keep checking until it was done. How do we do that? You guessed it — we use a loop, or repeated execution.

Python has two loop types: while-loops and for-loops. For-loops are perhaps the simplest. For instance:
for food in "spam", "eggs", "tomatoes":
    print "I love", food

This means: For every element in the list “spam”, “eggs”, “tomatoes”, print that you love it. The block inside the loop is executed once for every element, and each time, the current element is assigned to the variable food (in this case). Another example:
for number in range(1,100):
    print "Hello, world!"
    print "Just", 100 - number, "more to go..."

print "Hello, world"
print "That was the last one... Phew!"

The function range returns a list of numbers in the range given (including the first, excluding the last… In this case, [1..99]). So, to paraphrase this:

The contents of the loop are executed for each number in the range of numbers from (and including) 1 up to (and excluding) 100. (What the loop body and the following statements actually do is left as an exercise.)
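
A quick way to convince yourself of which numbers range(1,100) covers is to count them. A small sketch (the names count and last are just for illustration, and print is written with parentheses so that it also runs on newer Python versions):

```python
count = 0
for number in range(1, 100):
    count = count + 1      # runs once per number
    last = number          # remembers the most recent number

print(count)   # 99
print(last)    # 99
```

So the loop body runs 99 times, and the end value 100 itself is never reached.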

But this doesn’t really help us with our cooking problem. If we want to check the spam a hundred times, then it would be quite a nice solution; but we don’t know if that’s enough — or if it’s too much. We just want to keep checking it while it is not hot enough (or, until it is hot enough — a matter of point-of-view). So, we use while:
# Spam-cooking program

# Fetch the function *sleep*
from time import sleep

print "Please start cooking the spam. (I'll be back in 3 minutes.)"

# Wait for 3 minutes (that is, 3*60 seconds)...
sleep(180)

print "I'm baaack :)"

# How hot is hot enough?
hot_enough = 50

temperature = input("How hot is the spam? ")
while temperature < hot_enough:
    print "Not hot enough... Cook it a bit more..."
    sleep(30)
    temperature = input("OK. How hot is it now? ")

print "It's hot enough - You're done!"

New things in this example…
Some useful functions are stored in modules and can be imported. In this case we import the function sleep (which sleeps for a given number of seconds) from the module time which comes with Python. (It is possible to make your own modules too…)
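
As another illustration of importing, the module math, which also comes with Python, provides the constant pi. A small sketch that uses it for the circle area instead of 3.14 (the radius of 30 is taken from the earlier example, and print is written with parentheses so that it also runs on newer Python versions):

```python
from math import pi    # fetch the constant pi from the standard module math

radius = 30
area = pi * radius**2
print(area)            # a little more precise than using 3.14
```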
Exercise 2

Write a program that continually reads in numbers from the user and adds them together until the sum reaches 100. Write another program that reads 100 numbers from the user and prints out the sum.
Bigger Programs — Abstraction

If you want an overview of the contents of a book, you don’t plow through all pages — you take a look at the table of contents, right? It simply lists the main topics of the book. Now — imagine writing a cookbook. Many of the recipes, like “Creamy Spam and Macaroni” and “Spam Swiss Pie” may contain similar things, like spam, in this case - yet you wouldn’t want to repeat how to make spam in every recipe. (OK… So you don’t actually make spam… But bear with me for the sake of example :)). You’d put the recipe for spam in a separate chapter, and simply refer to it in the other recipes. So — instead of writing the entire recipe every time, you only had to use the name of a chapter. In computer programming this is called abstraction.

Have we run into something like this already? Yup. Instead of telling the computer exactly how to get an answer from the user (OK - so we couldn’t really do this… But we couldn’t really make spam either, so there… :)) we simply used input - a function. We can actually make our own functions, to use for this kind of abstraction.

Let’s say we want to find the largest integer that is less than or equal to a given positive number. For instance, given the number 2.7, this would be 2. This is often called the “floor” of the given number. (This could actually be done with the built-in Python function int, but again, bear with me…) How would we do this? A simple solution would be to try all possibilities from zero:
number = input("What is the number? ")

floor = 0
while floor <= number:
    floor = floor+1
floor = floor-1

print "The floor of", number, "is", floor

Notice that the loop ends when floor is no longer less than (or equal to) the number; we add one too much to it. Therefore we have to subtract one afterwards. What if we want to use this “floor”-thing in a complex mathematical expression? We would have to write the entire loop for every number that needed “floor”-ing. Not very nice… You have probably guessed what we will do instead: Put it all in a function of our own, called “floor”:
def floor(number):
    result = 0
    while result <= number:
        result = result+1
    result = result-1
    return result

New things in this example…
Functions are defined with the keyword def, followed by their name and the expected parameters in parentheses.
If the function is to return a value, this is done with the keyword return (which also automatically ends the function).

Now that we have defined it, we can use it like this:
x = 2.7
y = floor(x)

After this, y should have the value 2. It is also possible to make functions with more than one parameter:
def sum(x,y):
    return x+y
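
Once defined, such functions can be used inside larger expressions, which is the whole point of the abstraction. A short sketch, repeating floor so that it runs on its own; the name add is chosen here instead of sum only to avoid hiding Python’s built-in function of that name, and print is written with parentheses so that it also runs on newer Python versions:

```python
def floor(number):
    result = 0
    while result <= number:
        result = result+1
    result = result-1
    return result

def add(x, y):
    return x+y

total = add(floor(2.7), floor(5.1))   # 2 + 5
print(total)                          # 7
```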
Exercise 3

Write a function that implements Euclid’s method for finding the greatest common factor of two numbers. It works like this:
You have two numbers, a and b, where a is larger than b
You repeat the following until b becomes zero:
    a is changed to the value of b
    b is changed to the remainder when a (before the change) is divided by b (before the change)
You then return the last value of a

Hints:
Use a and b as parameters to the function
Simply assume that a is greater than b
The remainder when x is divided by z is calculated by the expression x % z
Two variables can be assigned to simultaneously like this: x, y = y, y+1. Here x is given the value of y (that is, the value y had before the assignment) and y is incremented by one
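
The last two hints can be tried out on their own before tackling the exercise. A minimal sketch with arbitrarily chosen numbers (print is written with parentheses so that it also runs on newer Python versions):

```python
print(17 % 5)      # 2, because 17 = 3*5 + 2

x = 1
y = 10
x, y = y, y+1      # both right-hand values are computed before assigning
print(x)           # 10
print(y)           # 11
```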
More About Functions

How did the exercise go? Was it difficult? Still a bit confused about functions? Don’t worry — I haven’t left the topic quite yet.

The sort of abstraction we have used when building functions is often called procedural abstraction, and many languages use the word procedure along with the word function. Actually, the two concepts are different, but both are called functions in Python (since they are defined and used in the same way, more or less.)

What is the difference (in other languages) between functions and procedures? Well — as you saw in the previous section, functions can return a value. The difference lies in that procedures do not return such a value. In many ways, this way of dividing functions into two types — those who do and those who don’t return values — can be quite useful.

A function that doesn’t return a value (a “procedure”) is used as a “sub-program” or subroutine. We call the function, and the program does some stuff, like making whipped cream or whatever. We can use this function in many places without rewriting the code. (This is called code reuse — more on that later.)

The usefulness of such a function (or procedure) lies in its side effects: it changes its environment (by mixing the sugar and cream and whipping it, for instance…). Let’s look at an example:
def hello(who):
    print "Hello,", who

hello("world")
# Prints out "Hello, world"

Printing out stuff is considered a side effect, and since that is all this function does, it is quite typical for a so-called procedure. But… It doesn’t really change its environment does it? How could it do that? Let’s try:
# The *wrong* way of doing it
age = 0

def setAge(a):
    age = a

setAge(100)
print age
# Prints "0"

What’s wrong here? The problem is that the function setAge creates its own local variable, also named age, which is only seen inside setAge. How can we avoid that? We can use something called global variables.

Note: Global variables are not used much in Python. They easily lead to bad structure, or what is called spaghetti code. I use them here to lead up to more complex techniques - please avoid them if you can.

By telling the interpreter that a variable is global (done with a statement like global age) we effectively tell it to use the variable outside the function instead of creating a new local one. (So, it is global as opposed to local.) The program can then be rewritten like this:
# The correct, but not-so-good way of doing it
age = 0

def setAge(a):
    global age
    age = a

setAge(100)
print age
# Prints "100"

When you learn about objects (below), you’ll see that a more appropriate way of doing this would be to use an object with an age property and a setAge method. In the section on data structures, you will also see some better examples of functions that change their environment.

Well — what about real functions, then? What is a function, really? Mathematical functions are like a kind of “machine” that gets some input and calculates a result. It will return the same result every time, when presented with the same input. For instance:
def square(x):
    return x*x

This is the same as the mathematical function f(x) = x². It behaves like a nice function, in that it only relies on its input, and it does not change its environment in any way.

So, I have outlined two ways of making functions: One type is more like a procedure, and doesn’t return a result; the other is more like a mathematical function and doesn’t do anything but return a result (almost). Of course, it is possible to do something in between the two extremes, although when a function changes things, it should be clear that it does. You could signal this through its name, for instance by using only a noun for “pure” functions like square and an imperative for procedure-like functions like setAge.
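
The two kinds can be put side by side in a small sketch. The names shopping and addItem are invented for this illustration; the example uses a list and its append method, both of which are explained in the next section, and print is written with parentheses so that it also runs on newer Python versions:

```python
def square(x):
    # "Pure": depends only on x and changes nothing
    return x*x

shopping = []

def addItem(item):
    # Procedure-like: returns nothing, but changes the list shopping
    shopping.append(item)

addItem("spam")
addItem("eggs")
print(square(4))    # 16
print(shopping)     # ['spam', 'eggs']
```

Note that no global statement is needed in addItem, because the existing list is changed rather than the name shopping being assigned a new value.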
More Ingredients — data structures

Well — you know a lot already: How to get input and give output, how to structure complicated algorithms (programs) and to perform arithmetic; and yet the best is still to come.

What ingredients have we been using in our programs up until now? Numbers and strings. Right? Kinda boring… Now let’s introduce a couple of other ingredients to make things a bit more exciting.

Data structures are ingredients that structure data. (Surprise, surprise…) A single number doesn’t really have much structure, does it? But let’s say we want several numbers put together into a single ingredient; that would have some structure. For instance, we might want a list of numbers. That’s easy:
[3,6,78,93]

I mentioned lists in the section on loops, but didn’t really say much about them. Well — this is how you make them. Just list the elements, separated by commas and enclosed in brackets.

Let us jump into an example that calculates primes (numbers divisible only by themselves or 1):
# Calculate all the primes below 1000
# (Not the best way to do it, but...)

result = [1]
candidates = range(3,1000)
base = 2
product = base

while candidates:
    while product < 1000:
        if product in candidates:
            candidates.remove(product)
        product = product+base
    result.append(base)
    base = candidates[0]
    product = base
    del candidates[0]

result.append(base)
print result

New things in this example…
The built-in function range actually returns a list that can be used like all other lists. (It includes the first index, but not the last.)
A list can be used as a logic variable. If it is not empty, then it is true — if it is empty, then it is false. Thus, while candidates means “while the list named candidates is not empty” or simply “while there are still candidates”.
You can write if someElement in someList to check if an element is in a list.
You can write someList.remove(someElement) to remove someElement from someList.
You can append an element to a list by using someList.append(something). Actually, you can use + too (as in someList = someList+[something]) but it is not as efficient.
You can get at an element of a list by giving its position as a number (where the first element, strangely, is element 0) in brackets after the name of the list. Thus someList[3] is the fourth element of the list someList. (More on this below.)
You can delete variables by using the keyword del. It can also be used (as here) to delete elements from a list. Thus del someList[0] deletes the first element of someList. If the list was [1,2,3] before the deletion, it would be [2,3] afterwards.
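
The list operations just described can be tried out one at a time. A minimal sketch with a made-up list; the comments show the list after each step, and print is written with parentheses so that it also runs on newer Python versions:

```python
someList = [1, 2, 3]

someList.append(4)        # [1, 2, 3, 4]
someList.remove(2)        # [1, 3, 4]
found = 3 in someList     # true, since 3 is still an element
third = someList[2]       # 4, because counting starts at 0
del someList[0]           # [3, 4]

if someList:              # a non-empty list counts as true
    print(someList)
```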

Before going on to explaining the mysteries of indexing list elements, I will give a brief explanation of the example.

This is a version of the ancient algorithm called “The Sieve of Eratosthenes” (or something close to that). It considers a set (or in this case, a list) of candidate numbers, and then systematically removes the numbers known not to be primes. How do we know? Because they are products of two other numbers.

We start with a list of candidates containing numbers [2..999]. We treat 1 as a prime (actually, by the usual modern definition it is not, but we include it here), and we wanted all primes below 1000. (Actually, our list of candidates is [3..999], but 2 is also a candidate, since it is our first base). We also have a list called result which at all times contains the updated results so far. To begin with, this list contains only the number 1. We also have a variable called base. For each iteration (“round”) of the algorithm, we remove all numbers that are some multiple of this base number (which is always the smallest of the candidates). After each iteration, we know that the smallest number left is a prime (since all the numbers that were products of the smaller ones are removed; get it?). Therefore, we add it to the result, set the new base to this number, and remove it from the candidate list (so we won’t process it again.) When the candidate list is empty, the result list will contain all the primes. Clever, huh?

Things to think about: What is special with the first iteration? Here the base is 2, yet that too is removed in the “sieving”? Why? Why doesn’t that happen to the other bases? Can we be sure that product is always in the candidate list when we want to remove it? Why?
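
For comparison, the sieve is more commonly written with a list of true/false flags rather than by removing candidates from a list. The following is a sketch of that variant, not the program above; the function name primesBelow is made up here, it follows the modern convention of not counting 1 as a prime, and print is written with parentheses so that it also runs on newer Python versions:

```python
def primesBelow(limit):
    # isPrime[n] tells whether n is still a candidate
    isPrime = [True] * limit
    primes = []
    for base in range(2, limit):
        if isPrime[base]:
            primes.append(base)
            # Cross out every multiple of base
            product = base * 2
            while product < limit:
                isPrime[product] = False
                product = product + base
    return primes

print(primesBelow(30))
# [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```

Crossing out a flag is a single step, whereas candidates.remove has to search the list each time, which is one reason this form tends to be faster.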

Now — what next? Ah, yes… Indexing. And slicing. These are the ways to get at the individual elements of Python lists. You have already seen ordinary indexing in action. It is pretty straightforward. Actually, I have told you all you need to know about it, except for one thing: Negative indices count from the end of the list. So, someList[-1] is the last element of someList, someList[-2] is the element before that, and so on.

Slicing, however, should be new to you. It is similar to indexing, except with slicing you can target an entire slice of the list, and not just a single element. How is it done? Like this:
food = ["spam","spam","eggs","sausages","spam"]

print food[2:4]
# Prints "['eggs', 'sausages']"
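
Negative indices and slices can be compared directly on the same list. A short sketch (print is written with parentheses so that it also runs on newer Python versions):

```python
food = ["spam", "spam", "eggs", "sausages", "spam"]

print(food[-1])     # spam (the last element)
print(food[-2])     # sausages
print(food[2:4])    # ['eggs', 'sausages']: elements 2 and 3, but not 4
```

As with range, the first position of a slice is included and the last one is not.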
More Abstraction — Objects and Object-Oriented Programming

Now there’s a buzz-word if ever there was one: “Object-oriented programming.”

As the section title suggests, object-oriented programming is just another way of abstracting away details. Procedures abstract simple statements into more complex operations by giving them a name. In OOP, we don’t just treat the operations this way, but objects. (Now, that must have been a big surprise, huh?) For instance, if we were to make a spam-cooking-program, instead of writing lots of procedures that dealt with the temperature, the time, the ingredients etc., we could lump it together into a spam-object. Or, perhaps we could have an oven-object and a clock-object too… Now, things like temperature would just be attributes of the spam-object, while the time could be read from the clock-object. And to make our program do something, we could teach our object some methods; for instance, the oven might know how to cook the spam etc.

So, how do we do this in Python? Well, we can’t just make an object directly. Instead of just making an oven, we make a recipe describing how ovens are. This recipe then describes a class of objects that we call ovens. A very simple oven class might be:
class Oven:
    def insertSpam(self, spam):
        self.spam = spam

    def getSpam(self):
        return self.spam

Now, does this look weird, or what?

New things in this example…
Classes of objects are defined with the keyword class.
Class names usually start with capital letters, whereas functions and variables (as well as methods and attributes) start with lowercase letters.
Methods (i.e. the functions or operations that the objects know how to do) are defined in the normal way, but inside the class block.
All object methods should have a first parameter called self (or something similar…) The reason will (hopefully) become clear in a moment.
Attributes and methods of an object are accessed like this: mySpam.temperature = 2, or dilbert.be_nice().

I would guess that some things are still a bit unclear about the example. For instance, what is this self thing? And, now that we have an object recipe (i.e. class), how do we actually make an object?

Let’s tackle the last point first. An object is created by calling the classname as if it were a function:
myOven = Oven()

myOven now contains an Oven object, usually called an instance of the class Oven. Let’s assume that we have made a class Spam as well; then we could do something like:
mySpam = Spam()
myOven.insertSpam(mySpam)

myOven.spam would now contain mySpam. How come? Because, when we call one of the methods of an object, the first parameter, usually called self, always contains the object itself. (Clever, huh?) Thus, the line self.spam = spam sets the attribute spam of the current Oven object to the value of the parameter spam. Note that these are two different things, even though they are both called spam in this example.
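
Put together, the pieces look like this. The class Spam is only assumed to exist in the text, so the empty version below is a hypothetical stand-in; is checks whether two names refer to the very same object, and print is written with parentheses so that it also runs on newer Python versions:

```python
class Spam:
    # Hypothetical minimal class; the text only assumes that it exists
    pass

class Oven:
    def insertSpam(self, spam):
        self.spam = spam

    def getSpam(self):
        return self.spam

myOven = Oven()
mySpam = Spam()
myOven.insertSpam(mySpam)    # inside the method, self is myOven

print(myOven.getSpam() is mySpam)    # True: the oven holds that very object
```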
Answer to Exercise 3

Here is a very concise version of the algorithm:
def euclid(a,b):
    while b:
        a,b = b,a % b
    return a
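
Tracing the function on a small case shows how it works: for 48 and 18 the pair (a, b) goes through (48, 18), (18, 12), (12, 6) and (6, 0), at which point the loop stops and 6 is returned. The numbers are chosen here only as an example, and print is written with parentheses so that it also runs on newer Python versions:

```python
def euclid(a,b):
    while b:
        a,b = b,a % b
    return a

print(euclid(48, 18))   # 6
print(euclid(17, 5))    # 1: no common factor other than 1
```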

Network Hacking

By Akram on 10:28:00 AM

There are many types of networks, from Internet networks to printer networks, and all of them can be accessed illegally and hacked in various ways. On some networks it is relatively simple to gain supervisor or admin access, which can then be used to read private files or to change the administrator tools that other users of the network rely on. Hacking can be used to crash networks, upload viruses and carry out a multitude of other illegal activities that range from irritating to costly and incredibly harmful to the system.

Hacking a company's network of computers, for example, can disrupt work and incur major costs in repairing the network, including the installation of stronger security measures. The wasted time and the associated costs can be harmful to a company, especially if private or financial files have been accessed; the repercussions in such cases can be extreme. Many companies now hire people with hacking skills in order to assess how vulnerable their networks are and to work out which security methods will give them the best results. The more prominent the company, the more security it needs to protect its systems.

Mobile Hacking

By Akram on 10:28:00 AM

Although hacking is generally associated with computers, mobile phones are also at risk from hackers. Mobile phones that use WAP access the Internet over wireless connections, and these connections can be easily hacked into. Phone companies have also been hacked, with the hackers gaining access to thousands of mobile phone user records. There are different ways to hack into mobile phones, depending on the type of wireless connection they use. Bluesnarfing, for example, is the specific method of hacking into phones that use Bluetooth connections. A hacker would do this in order to copy information from the phone, such as the contacts.

People are using their mobile phones to send increasingly private information as technology enables the use of email and similar services on mobile phones. Mobile phone hacking can mean tricking the mobile phone company into providing free texts, calls or WAP access. It can also include the cloning of SIM cards in order to charge calls to another person's name and number, or hacking to gain access to a person's confidential information such as passwords or account details.

Linux Hacking

By Akram on 10:27:00 AM

Linux is an operating system that can be used on a computer. A Linux system is run by the kernel, which controls the programs and hardware components. It is at the core of the system and has access to services that no other program can reach directly. Special tasks can only be carried out by going through the kernel. The kernel does not allow programs to perform illegal operations, which is why the kernel has to be accessed for hacking of a Linux system to be successful.

Once hackers get into the Linux kernel, they can potentially change the system options to allow programs to perform all sorts of illegal operations. The hacker has to force the Linux operating system to allow these actions in order to break into all areas of the system. Hacking can be used for many different purposes, from accessing private information stored in files to logging keystrokes, which can reveal a person's passwords.

Wi-Fi Hacking

By Akram on 10:27:00 AM

Wi-Fi is a way of using the Internet wirelessly. No cables are required, and it was originally designed for use with laptops and mobile phones. Wi-Fi has become increasingly popular due to the lack of cables and the cheap installation costs, and many people now use it instead of wired Internet connections. The disadvantages of Wi-Fi include the fact that it can be easily hacked into. The most common form of encryption for Wi-Fi can be broken even if it is configured properly. There are people who travel around with laptops looking for insecure Wi-Fi connections that can be hacked into.

Using these methods, a person could potentially "steal" use of another person's Internet connection and break into files with ease. Many hackers use other people's Wi-Fi connections to break into secured files, because the connection is wireless and in another person's name, which makes it harder to trace back to them. This type of hacking can be serious if companies that hold important files use Wi-Fi, as the sensitive information in those files could be easily accessed by a hacker. There have already been convictions for this type of hacking.

Wi-Fi

By Akram on 10:27:00 AM

Wi-Fi is commonly said to be short for "wireless fidelity". It operates in the same 2.4 GHz radio band as Bluetooth, but generally at higher power and over a longer range. Wi-Fi is sometimes called wireless Ethernet.

It is used to create networks, and its configuration can be more complicated than that of traditional wired networks. Wi-Fi is increasingly popular as it offers seamless connectivity without the need for wires. The feature is included in almost all laptops and is often offered as a USB plug-and-play adapter that enables Wi-Fi on a desktop computer.

Wi-Fi certification indicates that network devices comply with the IEEE 802.11 wireless networking standard, and Wi-Fi is widely used in a variety of handheld devices. WiMAX is often described as the next generation and has been much touted as the replacement for Wi-Fi.

Presently Wi-Fi is being expanded to include a wide variety of applications in LANs, VoIP, gaming, DVD players, digital cameras and even intelligent transport systems, so that safety and other data can be transmitted to cars on highways while they are traveling.

Intrusion Prevention

By Akram on 10:26:00 AM

An intrusion detection system detects intrusions. An intrusion prevention system goes further: it reacts automatically to block any attack or attempted intrusion into the computer system or the computer network.

The intrusion detection system is just a monitoring system. It sniffs packets off a switch port and logs information or generates alerts. The intrusion prevention system goes one step further. It is an active intermediary, like a firewall, that intercepts packets and forwards them on to the network only if they are acceptable. It blocks attacks in real time and acts like an advanced firewall. Most intrusion prevention systems contain firewall software as well. The latest generation of firewalls combines deep stateful packet inspection with an intrusion prevention engine to thwart attacks on a system or private network.

Intrusion prevention systems are either host based or network based. A network-based intrusion prevention system has a larger and more modular attack prevention system than a host-based one.

Content-based as well as rate-based intrusion prevention systems help against more modern types of attack, such as denial of service or distributed denial of service attacks.

Intrusion Detection

By Akram on 10:26:00 AM

Information increasingly needs to be protected from attempts to destroy, duplicate or copy it. Within information security, efforts to meet this need have provided detection first and prevention later.

Intrusion detection is done by a variety of means. The most frequently used is a manual check for actions that may have compromised the confidentiality, integrity or availability of a given resource. Manual detection is usually done by checking the log files and the system for evidence of intrusions.

This process has been automated to an extent and is still being perfected. A system that detects intrusion automatically and alerts the administrator is called an intrusion detection system. Such systems are built on statistical modeling of traffic and application data in order to detect anomalies. They can be either host based or network based. If the system runs on a single machine only, it is a host-based (personal) intrusion detection system; otherwise it is a network intrusion detection system.
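
The statistical idea can be illustrated with a toy example. The sketch below is not how any particular intrusion detection product works; the traffic figures, the threshold of 3 standard deviations and the function name find_anomalies are all assumptions made for illustration. It flags measurements that lie far from the average of a baseline regarded as normal:

```python
from math import sqrt

def find_anomalies(baseline, observed, threshold=3.0):
    # Mean and standard deviation of the traffic considered normal
    mean = sum(baseline) / float(len(baseline))
    variance = sum([(x - mean) ** 2 for x in baseline]) / float(len(baseline))
    deviation = sqrt(variance)
    # Report every observation more than `threshold` deviations from the mean
    anomalies = []
    for value in observed:
        if deviation > 0 and abs(value - mean) / deviation > threshold:
            anomalies.append(value)
    return anomalies

# Made-up numbers: connections per minute seen on a quiet network
normal_traffic = [98, 102, 101, 99, 100, 97, 103, 100]
print(find_anomalies(normal_traffic, [101, 96, 450]))   # [450]
```

A real system would model many more features than a single count, but the principle of comparing new observations against a learned baseline is the same.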

The type of alert generated by an IDS is determined by how serious the intrusion is. It could write the relevant information to a log file or database, or it could even send an email or a message to a pager or mobile phone.
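The mapping from seriousness to alert type can be pictured with a small sketch. The numeric severity levels and the channel names below are assumptions made up for this example.

```python
def route_alert(severity):
    """Return the channels an alert should go to (hypothetical levels 1 to 3)."""
    channels = ["log"]  # every alert is at least written to the log
    if severity >= 2:
        channels.append("email")  # more serious: mail the administrator
    if severity >= 3:
        channels.append("pager")  # most serious: page or text someone
    return channels


for level in (1, 2, 3):
    print(level, route_alert(level))
```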

Hacking Tools

By Akram on 10:25:00 AM

Many hackers are essentially great tool users. They exploit software using these tools; it is not so much the art of programming. They break a trivial target program by using the tools to manipulate it in a number of ways.

There are hundreds of easy to use tools that can be used to get the IP address of a remote system. These tools are also used to scan the various ports to see if any of them can be used to enter the target system. Adware, spyware, worms and Trojan horses have been deployed onto target systems using these tools. These in turn create greater vulnerabilities, so that the hacker's other tools can be used for other purposes. Wireless devices have not been spared either.

Smarter hackers use disassemblers, scripting engines and input generators to find out more about a piece of software. They find bugs in the given software and use these loopholes to gain access to systems wherever the software is installed. A few such tools are DeCSS, ColdLife, IntelliTamper, John the Ripper, and Nmap for Windows.

Hacking Software

By Akram on 10:25:00 AM

Hacking software is a wide variety of software sold in the market that makes hacking easy. It is as easy as the advert says it is: click and hack.

Much hacking software features automated tools. These tools allow fraudsters to make major as well as minor changes on a computer or on a network of hacked PCs. All this can be achieved simply with a click of the mouse, drag and drop, or a pull-down menu. You can even delete or add files on an infected computer.

Many of these automated tools also reveal all the passwords stored on the computer. This may include user passwords or stored forms that contain personal information. Some software contains sophisticated algorithms to crack encrypted password files. John the Ripper cracks MD5 password hashes in almost no time. RainbowCrack is another password cracker. Bugbear and IRC bots help you become a moderator of a chat room or even pose as an official of the company. Much of this software is updated regularly with tools and scripts that could help you spy on and hack into people's computers. Packet storm enables you to take a peek into IP packets and extract information that matters.

Hacking Run

By Akram on 10:24:00 AM

A hacking run is a hacking session in which the hacker is attempting to get past a particular security level, or is attempting to retrieve information and is limited by the sheer volume of it. In such cases the session may extend beyond a day, sometimes with twelve hours at a stretch.

In many cases such a long period of hacking could spell danger, as the people using the computer or network being hacked could get suspicious. However, hackers are becoming cleverer and use botnets to spread the source of the hacking so that they are not detected. Hacking by these methods involves placing Trojans on unsuspecting people's computers and using these systems in the long hacking sessions.

Whatever the methods used, hackers are always limited by the bandwidth of the internet connection. Traffic and bandwidth limits reduce the speed of data transfer and hence may contribute to an extended session or hacking run. The best means of preventing a hacking run is to be security conscious and to regularly inspect outgoing and incoming data to find out if anyone is trying to access your network or system.

Hacking Programs

By Akram on 10:24:00 AM

Hacking programs are the many ready-made programs that help in the process of hacking.

These programs are mainly used for reconnaissance, so that the hacker gathers the important information before making the attack. The various reconnaissance programs are part of the hacker's repertoire of tools. For personal computers and networks this mainly means finding loopholes in the internet protocols and in the security measures on the system. Port mapping is one method used. When attempts to get at your system through IP ports fail, the next method would be sending you a mail or getting information through social engineering techniques that could help in further attacks.

Hacking programs are mostly available as spyware that helps others gather data from your system. It gives away information by sending out data from key loggers and even screenshots. This might help the hacker determine what other hacking programs to use to get control of your network or machine.

Some of the hacking programs are John the Ripper, RainbowCrack, Bugbear, DeCSS 1.2b, Coldlife 4.0, PCHelps Network Tracer, Backdoor.IRC.ColdLife.30 and NMap Win 1.2.12. Some of these programs are used to hack websites as well.

Hacking

By Akram on 10:23:00 AM

With respect to a computer system, hacking can mean its use without a specific constructive purpose and without proper authorization. More broadly, hacking is mastery over technology.

Hacking involves knowledge of a variety of subjects. Hacking the telephone could mean mastery of the various modes of communication, which relates to the subject of electronics. It would mean understanding the principles of electricity well enough to tap the telephone and to know how electricity makes it work. It would also mean having knowledge of switching systems and of the Internet Protocol, over which some telephone communication takes place.

For a computer with a variety of software and hardware, it essentially means gaining complete or in-depth knowledge of a particular aspect that could help you go beyond what the software or hardware was intended to do. This manipulation of technology to take it beyond its intended capacity is what hacking is all about. It can happen at a public kiosk, where the security restrictions of the software are bypassed so that a web site or some computer may be accessed. It can also happen in a corporate or home environment. There are some who do it just for the heck of it. There are others who make money out of hacking. Hence the distinctions among white hat hacking, black hat hacking and grey hat hacking. Blue hat hacking is a recent addition, referring to paid security analysts and consultants finding loopholes in a program.

Hacker's Bible

By Akram on 10:23:00 AM

Hacker's Bible refers to a wide range of books on how to learn the workings of a particular system. Such a book gives information on how to disassemble the system, tweak the various parts, fine tune its working and reassemble it.

Some of the famous hacker's bibles are the scanner hacker's bible, the cellular hacker's bible, the cable hacker's bible, the CGI hacking bible, etc. The hacker's bible is essentially a guide that takes you through the various processes. Its possible source is 2600: The Hacker Quarterly, which focuses on different aspects of technology. The bible talks about how to augment the capabilities of any electronic apparatus.

Another possible source of the hacker's bible is the Jargon File. The Jargon File is a glossary of hacker slang that has been collected since the old days of the ARPANET. The hacker slang is provided so that those who search for hacking terms can do so discreetly, with the general public being ignorant of what is actually going on.

Hacker

By Akram on 10:22:00 AM

The term hacker, which initially meant a person who had mastery over computers, is now used in a broader sense. Now you find ATM hackers, game equipment hackers, mobile hackers and so on.

In most connotations it means someone who can analytically take apart any system, understand its workings and put it back together again. Hacker was initially a term used for computer geeks at MIT, who would use programming to showcase their knowledge through exploits.

The mass media now uses the term hacker synonymously with criminals who use programming and mastery of aspects of the computer system, usually with illegal means and ends.

To differentiate the term used by the mass media from the original term, a plethora of new definitions has emerged. These are white hat hacker, grey hat hacker, blue hat hacker, black hat hacker, cracker, script kiddie, Google hacker and so on.

A white hat hacker is a person who hacks for altruistic reasons, a black hat hacker does it for dubious reasons, and the grey hat hacker is in between, with an ambiguous set of ethics.

Blue hat hackers are those brought in to test particular software before it is launched. These are security experts or consulting firms looking for exploitable flaws in the software so that they can be closed before the launch.

Google Hacking

By Akram on 10:22:00 AM

Google hacking usually refers to the art of using Google search to get information from computers that are not secure. It is typically used to probe websites for vulnerabilities and various exploits. These vulnerabilities and exploits are then used to get private, sensitive information like credit card numbers, social security numbers and passwords.

Google hacking is usually done using the advanced search operators provided by Google to get at sensitive data. Though the same advanced searches can be carried out on Yahoo, MSN or any other search engine for that matter, it has been named Google hacking as it was first shown that it could be done using Google.

The process involves searching for specific strings of text like "passwords" or for the names of files where passwords are stored. Google hacking owes its publicity to the many presentations and instructional guides by Johnny Long and others, which have demonstrated the various ways Google can be used to hack websites and networks. By using the filetype advanced search operator, some hackers have been able to get document files from the defense and space establishments of the United States.

Game Hacking Tools

By Akram on 10:21:00 AM

There are many games, some of which are played online and in real time. The levels of difficulty and skill in most games are meant to give the player a thrill. But some players always get stuck at a particular level and are not able to get past it.

Game hacking tools are various methods of cheating a software game so that the difficult level is bypassed. They are also used to enhance scores, provide more lives, or even illegally unlock additional facilities or features offered during play.

Game hacking tools are usually software programs provided on a number of gaming sites, with instructions on how to run the programs and where to place the code in the existing game program. Game hacking tools are available not only for computer based gaming systems but also for standalone video game consoles like the Xbox, PSP and even Nintendo.

These game hacking tools are also known as game cheats and are available for a wide variety of games. They are usually created through reverse engineering, or by using hex editors to find the relevant parts of the code and tweak them so that the purpose of hacking the game is accomplished.

Firewall

By Akram on 10:21:00 AM

A firewall is a means of keeping intruders out of a computer or a network. It is a primary method of blocking traffic into and out of a computer system or private network. Firewalls on a personal computer are known as personal firewalls, while those protecting a private network from intrusions from the internet are known as internet firewalls.

A firewall can be a combination of hardware and software. It can also be software only. On a private network it is usually installed on the router. A router may be a hardware device that has some intrusion detection and prevention software installed on it. It can also be software running on a computer that acts as the private network's gateway to the internet.

Windows Firewall keeps a check on spyware activities on your personal computer. You can also get a firewall appliance that filters out unwanted packets coming into your network. Many techniques are used in firewalling. These include stateful inspection of inbound packets, to identify whether the inbound packets were requested by a user, and network address translation, where only one IP address is presented to the outside world and the firewall translates and controls traffic for the other computers on the network. The latter is part of Windows Internet Connection Sharing. Packet filtering and proxy servers are other techniques used.
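Stateful inspection can be sketched in a few lines: the firewall remembers which connections were opened from the inside and only lets in packets that belong to one of them. The class `StatefulFilter`, its method names and the sample addresses are assumptions for illustration; real firewalls also track protocol state and expire old entries.

```python
class StatefulFilter:
    """Allow inbound packets only when they answer a connection opened from inside."""

    def __init__(self):
        # Hypothetical connection table:
        # (local_ip, local_port, remote_ip, remote_port) for each outbound connection.
        self.connections = set()

    def outbound(self, local_ip, local_port, remote_ip, remote_port):
        # A user on the inside opened this connection, so remember it.
        self.connections.add((local_ip, local_port, remote_ip, remote_port))

    def inbound_allowed(self, remote_ip, remote_port, local_ip, local_port):
        # Inbound traffic passes only if it matches a remembered connection.
        return (local_ip, local_port, remote_ip, remote_port) in self.connections


fw = StatefulFilter()
fw.outbound("192.168.1.10", 50000, "93.184.216.34", 443)

reply = fw.inbound_allowed("93.184.216.34", 443, "192.168.1.10", 50000)
unsolicited = fw.inbound_allowed("203.0.113.9", 443, "192.168.1.10", 50000)
print(reply, unsolicited)
```

The reply from the server the user contacted is let through, while the packet from an address nobody inside asked for is refused.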

Ethical Hacker

By Akram on 10:20:00 AM

Can a hacker be considered to be ethical? What then does the term mean?

An ethical hacker is a person who legally attempts to break into a computer system. This is done with the prior consent of the owner of the system or network. The purpose of breaking into the computer system is to check whether it has any vulnerabilities and whether it can be hacked into by others. This is also called penetration testing.

Many ethical hackers are hackers who have turned consultant to help prevent hacking with criminal intent. Ethical hackers now have a certification, Certified Ethical Hacker, which makes them take an oath to a code of ethics in terms of hacking into others' systems.

Ethical hackers are also known as white hat hackers. Though the term is meant to represent the good guys only, there is another group known as grey hat hackers, who sometimes function as ethical hackers and at other times are bad guys following black hat hacking methods.

Email Hacking

By Akram on 10:20:00 AM

Email hacking seems to be a misnomer, as most of the sites that provide email services are rather well secured against hacking. They give you a secure connection when you log in.

Email hacking is all about how you use your mail account. It is not the web server that gets hacked but the email client on your system. This gives the hacker the mail you have saved in local folders, including the address book and the personal details of the contacts in your address list.

Another means is to get the password of your email account. This can be done in a variety of ways. One method is to get the password from the file in which it is stored on the PC. This file contains the password only if you chose the option of "remembering your email ID and password" when you logged in.

The second method of email password hacking is to try the various passwords you are likely to use. This is guesswork and is more or less based on personal information about you, like dates of birth, anniversaries, or the names of your spouse, kids or pets, since these are commonly used as passwords. The third method is key logger software.
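Since guessable passwords are built from personal details, you can check your own password against them before using it. The sketch below is one simple way to do that; the function name `uses_personal_info` and the sample facts are made up for illustration.

```python
def uses_personal_info(password, personal_facts):
    """Return True if the password contains any of the given personal details."""
    lowered = password.lower()
    # Ignore very short facts, which would match almost anything.
    return any(fact.lower() in lowered for fact in personal_facts if len(fact) >= 3)


# Hypothetical details about the account owner: a pet's name, a birth year, a spouse's name.
facts = ["Rex", "1984", "Maria"]

print(uses_personal_info("rex1984!", facts))   # built from the pet name and birth year
print(uses_personal_info("t7#Qw9zLp", facts))  # unrelated to the personal details
```

A password that fails this check is exactly the kind the guessing method above would find first.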

Credit Card Hacking

By Akram on 10:19:00 AM

Credit card hacking is essentially identity theft. It is the deliberate use of someone else's identity for financial gain. Identity theft is a serious breach of privacy, and it can be carried out by a number of methods.

One, it can be done online by a spurious website that offers you some service and collects your personal details to carry out its modus operandi later on. Sometimes these sites practice credit hijacking, where they keep billing you a small amount monthly so that you will not notice the difference.

Two, it can be done with special electronic gadgets that are able to read your credit card and reproduce the information on fake cards. The crooks also use a spy camera to record your PIN or authentication code. Then they use this information for illegal withdrawals of cash or purchases of merchandise.

Three, credit card hacking can involve hacking into computers that hold the data of credit card users and then using that data for fraudulent ends.

Cracking

By Akram on 10:19:00 AM

Cracking is the act of breaking into a computer system. It can also be the act of breaking the copy protection of a commercial program, or of removing the license protection from a piece of software. Usually it is a patch or a series of instructions that removes the copy protection so that the program changes from the demo version to the full version. In some ways it is also termed an exploit, which crackers accomplish with tools that aid them.

Cracking is done by spreading rootkit Trojans so that access is gained to other systems without much programming effort. The same goes for removing the license or copy protection from programs. Crackers think of themselves as better than the programmers who created the protections, but their effort is mediocre since they use automated tools. Hence those following these methods are not given the term hackers; rather, they are termed crackers.

Computer Hacking

By Akram on 10:18:00 AM

Computer hacking is a term used for illegal access to a resource on another person's computer. Hacking a computer requires proficiency in programming, knowledge of the operating system or applications, and knowledge of hacking tools.

Initially the term hacker was used approvingly to acknowledge prowess in programming. Now it is a derogatory term used to denote criminals who have gained access to other computers and stolen data or personal information. This information is then sold to third parties for a sum. The term hacker was more often than not wrongly used by the press, and it has hence stuck with its derogatory connotation.

Computer hacking is now performed using a variety of software that can be downloaded from a number of sites. Using this software, it has become easy even for teenagers who do not have programming knowledge to gain access to a computer that is not secure enough. Most teenagers take to computer hacking just to make a statement among their peer group. This has landed quite a few teenagers in trouble, as they went from proving themselves to making a fast buck in the process.

Crackers

By Akram on 10:16:00 AM

Bluetooth Hacking

By Akram on 10:16:00 AM

Bluetooth is a wireless technology meant for personal area networks. It is a standard for short range communication (10 meters and a bit more). The communication may be point to point or point to multipoint. Used mainly in cellular phones, PDAs and laptops, this technology has been subject to hacking by unscrupulous persons.

After it was proved that Bluetooth devices were subject to theft of data, the term bluesnarfing came to be used for the process of illegally retrieving data from the calendar, contact list, emails and text messages of a mobile phone.

Bluejacking is the simple act of sending a Bluetooth device an unsolicited message, such as a vCard with a name and message field. It is used for bluedating and bluechatting. This is done using the OBEX protocol.

With the various attempts at, and proofs of concept for, bluejacking Bluetooth devices, security has been added to the communication between wireless devices. You can also stay on the safe side by turning off your Bluetooth connection when you do not need it.

ATM Hacking

By Akram on 10:14:00 AM

An Automatic Teller Machine (ATM) is a machine that communicates with a financial institution or bank to provide a secure method of dispensing money. It replaces the teller at the bank, hence the name.

The process is simple: you insert a personally coded card as a means of authentication. The ATM communicates with the bank or credit card account and checks your balance. It then allows you to pay a bill, make a transfer, or withdraw the amount requested.

ATM hacking is about getting your personally coded information from the machine by fraudulent means. This is done by placing a spy camera so that the personal identification number (PIN) is recorded by the crooks.

The code number on your card is obtained by a reader that reads the magnetic stripe on your card and transmits it to a nearby receiver, where it is recorded by the hacker.

With this information the hacker creates a card with your code and then withdraws money from your account using the PIN recorded by the spy camera.

There are other methods of fraud related to ATMs. A computer geek could get the administrator password of the ATM and change the denomination the machine believes it holds. For example, he could make the machine treat its $50 bills as $10 bills. If he then requested $100, the machine would dispense ten $50 bills ($500) instead of ten $10 bills ($100).
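The arithmetic behind that example can be written out as a short sketch. The function name `dispensed_value` and its parameters are invented for illustration and do not describe any real ATM software.

```python
def dispensed_value(requested, configured_denomination, loaded_denomination):
    """Return (number of bills, real cash value) for a withdrawal request.

    The machine counts out bills as if each were worth `configured_denomination`,
    while the bills actually in the cassette are worth `loaded_denomination`.
    """
    bills = requested // configured_denomination
    return bills, bills * loaded_denomination


# Correctly configured machine: ten $10 bills for a $100 request.
print(dispensed_value(100, 10, 10))
# Tampered machine: it counts out ten bills, but each one is really a $50 bill.
print(dispensed_value(100, 10, 50))
```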