2. SMB: The Server Message Block Protocol


 


  

2.1 A Little Background on SMB

email
From: Steven French, Senior Software Engineer, IBM
To: Chris Hertel

Chris,

Hope things are going well in the cold north ...

I thought the following info would be interesting to you. I met the original "inventor" of SMB a few years ago - Dr. Barry Feigenbaum - who back in the early 80's was working on network software architecture for the infant IBM PCs, working for IBM in the Boca Raton plant in Florida. He mentioned that it was first called the "BAF" protocol (after his initials) but he later changed it to SMB. In the early DOS years IBM and Microsoft (with some input from Intel and 3Com) contributed to it but by the time of the first OS/2 server version (LANMAN1.0 dialect and later) Microsoft did much of the work (for "LAN Manager" and its relatives).
 

Like NetBIOS, the Server Message Block protocol originated a long time ago at IBM. Microsoft embraced it, extended it, and in 1996 gave it a marketing upgrade by renaming it "CIFS".

Over the years there have been several attempts to document and standardize the SMB/CIFS protocol:
 

Change is the essential process of all existence.
-- Spock (Leonard Nimoy), Let That Be Your Last Battlefield, stardate 5730.2
  
  • Microsoft keeps an archive of documentation covering older versions of SMB/CIFS. The collection spans a period of roughly ten years, starting at about 1988 with the SMB Core Protocol. The collection is housed, it seems, on a dusty FTP server in a forgotten corner of a machine room somewhere in the Pacific Northwest. The URL for the CIFS archive is ftp://ftp.microsoft.com/developr/drg/CIFS/.

  • In 1992, X/Open (now known as The Open Group) published an SMB specification titled Protocols for X/Open PC Interworking: SMB, Version 2. The book is now many years out of date and SMB has evolved a bit since its publication, yet it is still considered one of the best references available1. The Open Group is a standards body so the outdated version of SMB described in the X/Open book is, after all, a standard protocol.

  • A few years later, Microsoft submitted a set of CIFS Internet Drafts to the IETF (Internet Engineering Task Force), but those drafts were somewhat incomplete and inaccurate and they were allowed to expire. Microsoft's more recent attempts at documenting CIFS (starting in March, 2002) have been rendered useless by awkward licensing restrictions, and from all accounts contain no new information2. The expired IETF Internet Drafts (by Paul Leach and Dilip Naik) are still available from the Microsoft FTP server described above and other sources around the web.

  • The CIFS Working Group of the Storage Networking Industry Association (SNIA) has published a CIFS Technical Reference based on the earlier IETF drafts. The SNIA document is neither a specification nor a standard, but it is freely available from the SNIA website.

Without a current and authoritative protocol specification, there is no external reference against which to measure the "correctness" of an implementation, and no way to hold anyone accountable. Since Microsoft is the market leader, with a proven monopoly on the desktop, the behavior of their clients and servers is the standard against which all other implementations are measured.
 

You knew the job was
dangerous when you took it.
-- Super Chicken
Jay Ward and Bill Scott,
ABC TV, 1967-1968
  

Jeremy Allison, the Samba Team's First Officer3, has stated that "The level of detail required to interoperate successfully is simply not documentable". One reason that this is true is that Microsoft can "enhance" SMB behavior at will. Combined with the dearth of authoritative references, this means that the only criterion for a well-behaved SMB implementation is that it works with Microsoft products. As a result, subtle inconsistencies and variations have crept into the protocol. They are discovered in much the same way that a dog-owner discovers poop in the yard in springtime when the snow melts4.

    
Many people dread spring chores, but spring also brings the flowers. The children play, the dog chases a butterfly, the birds sing...and it all seems suddenly worthwhile. Likewise with the work we have ahead. Things are not really too bad, once you've gotten started.

2.1.1 Getting Started

This part of the book will cover the basics of SMB, enumerate and describe some of the SMB message types (commands), discuss protocol dialects, give some details on authentication, and provide a few examples. That should be enough to help you develop a working knowledge of the protocol, a working SMB client, and possibly a simple server.

Bear in mind, though, that SMB is more complex and less well defined than NBT. In the NBT section it was possible to describe every message type and provide a comprehensive review of the entire NBT protocol. It is not practical to cover all of SMB in the same way. Instead, the goal here is to explain the basics of SMB, provide details that are missing from other sources, and describe how to go about exploring SMB on your own. In other words, the goal is to develop understanding rather than simply providing knowledge.

The textbook for this class is the latest version of the SNIA CIFS Technical Reference. Additional sources are listed in the References section near the end of this book. The most important tool, however, is probably the protocol analyzer. Warm up your copy of Ethereal or NetMon, and get ready to do some packet shoveling.

2.1.2 NBT or Not NBT

Before we actually start, there is one more thing to mention: The SMB protocol is supposed to be "transport independent". That is, SMB should work over any reliable transport that meets a few basic criteria. NBT is one such transport, but SMB does not really require the NetBIOS API. It can, for instance, be run directly over TCP/IP.

Just for fun, we will refer to SMB over TCP/IP without NBT as "naked" or "raw". When running naked, SMB defaults to using TCP port 445 instead of the NBT Session Service port (TCP/139). Windows2000, WindowsXP, and Samba all support raw transport, but the large number of "legacy" Windows clients still in use suggests that NBT will not go away any time soon.

Other than the new port number, there are only two notable changes between NBT and naked transport. The first is that naked transport does not make use of the NBT SESSION REQUEST and POSITIVE SESSION RESPONSE messages. The second is that the two transports interpret the SESSION MESSAGE header a bit differently.

Recall (from section 1.6) that the NBT Session Service prepends a four-byte header to each SESSION MESSAGE, like so:

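 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   0 (zero)    | <reserved>  |        LENGTH (17 bits)         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+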

The LENGTH field, as shown, is 17 bits wide5. Raw TCP transport also prepends a four-byte header, but the full 24 bits are available for the LENGTH:

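 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|   0 (zero)    |               LENGTH (24 bits)                |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+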

 

Your mileage may vary.
-- advertiser's disclaimer

  

Appendix B of the SNIA CIFS Technical Reference is the only source that was found which clearly shows the naked transport LENGTH field as being 24 bits wide. 24 bits translates to 16 megabytes, though, and that's a bigbunch--more than is typically practical. Fortunately, the actual maximum message size is something that is negotiated when the client and server establish the session.
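The difference between the two headers is easier to see in code than in prose. What follows is a minimal sketch, not anything taken from a real implementation: the function names put_session_header() and get_session_length() are made up for this example, and the only things assumed are the layouts described above (a zero first byte, then either a 17-bit or a 24-bit LENGTH in network byte order).

```c
#include <stdint.h>

#define NBT_MAX_LENGTH 0x1FFFFu    /* 17 bits, NBT transport   */
#define RAW_MAX_LENGTH 0xFFFFFFu   /* 24 bits, naked transport */

/* Fill in the four-byte SESSION MESSAGE header for a payload of len bytes.
 * raw != 0 selects naked framing, raw == 0 selects NBT framing.
 * Returns 0 on success, or -1 if len does not fit in the LENGTH field.
 * (Hypothetical helper, for illustration only.)
 */
int put_session_header(uint8_t hdr[4], uint32_t len, int raw)
{
    uint32_t max = raw ? RAW_MAX_LENGTH : NBT_MAX_LENGTH;

    if (len > max)
        return -1;
    hdr[0] = 0x00;                              /* always zero        */
    hdr[1] = (uint8_t)((len >> 16) & 0xFF);     /* high-order bits    */
    hdr[2] = (uint8_t)((len >> 8) & 0xFF);
    hdr[3] = (uint8_t)(len & 0xFF);
    return 0;
}

/* Read the LENGTH back out of a received header.
 * Returns the payload length, or -1 if the first byte is not zero
 * (meaning that this is not a SESSION MESSAGE).
 */
long get_session_length(const uint8_t hdr[4], int raw)
{
    uint32_t len;

    if (hdr[0] != 0x00)
        return -1;
    len = ((uint32_t)hdr[1] << 16) | ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
    if (!raw)
        len &= NBT_MAX_LENGTH;                  /* drop the reserved bits */
    return (long)len;
}
```

A length of 0x20000 (128 kilobytes) is one bit too wide for the NBT header, so put_session_header() refuses it for NBT but accepts it for naked transport.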

When we discuss the SMB messages themselves we will ignore the SESSION MESSAGE headers, since they are part of the transport, not the SMB protocol.


2.2 An Introductory Tour of SMB

We will start with a quick museum tour of SMB. Our guide will be the venerable Universal Naming Convention (UNC). You may remember UNC from the brief introduction way back in section 1.1. UNC will provide directions and point out highlights along the tour.

Please stay together, everyone.

The UNC directions are presented in terms of a path, much like the Uniform Resource Identifier (URI) paths that are used on the World Wide Web. To explain UNC, let us first consider something more modern and familiar:

http://ubiqx.org/cifs/SMB.html

That string is in URI syntax, as used by web browsers, and it breaks down to provide these landmarks:

http == The protocol to use.
ubiqx.org == The name of the server.
cifs == The directory path.
SMB.html == The file name.

The landmarks guide us along a path which eventually leads us to the file we wanted to access.

The SMB protocol pre-dates the use of URIs and was originally designed for use on LANs, not internetworks, so it naturally has a different (though surprisingly similar) way of specifying paths. A Universal Naming Convention (UNC) path comparable to the URI path above might look something like this:

\\ubiqx\cifs\SMB.html

...and would parse out like this:

ubiqx == The name of the server.
cifs == The directory path.
SMB.html == The file name.

Very similar indeed.
 

The devil is in the details.
-- Popular saying

  

One obvious difference between the two formats is that UNC doesn't provide a protocol specification. That's not because it always assumes SMB. The UNC format can support all sorts of filesharing protocols, but it is up to the underlying operating system or application to try to figure out which one to use. Protocol and transport discovery are handled by trial-and-error, with each possibility tested until something works. As you might imagine, a system with AppleTalk, NetWare, and SMB all enabled may have a lot of work to do.

The UNC format is handled natively by Microsoft & IBM's extended family of operating systems: DOS, OS/2, and Windows6. Samba's smbclient utility can also parse UNC names, but it does so at the application level rather than within the OS and it only ever tries to deal with SMB. Even so, smbclient must handle both NBT and naked transport, which can be tricky.

2.2.1 The Server Identifier

The first stop on our UNC tour of SMB is the server name field, which is really a server identifier field because it will accept addresses in addition to names. This book concerns itself with only two transports--NBT and naked TCP transport--so the only identifiers we care about are:

  • NetBIOS names,
  • DNS names, and
  • IP addresses.

NetBIOS and DNS names both resolve to IP addresses, so all three are equivalent.

Sort of...

Recall that the NBT SESSION REQUEST packet requires a CALLED NAME in order to set up an NBT session with the server. Without a correct CALLED NAME, the NBT SESSION REQUEST may be rejected (different implementations behave differently). So...

  • if the transport is NBT (not raw),
  • and the server is identified using a DNS name or IP address...

...then we're in a bit of a pickle. How do we find the correct NetBIOS name to put into the CALLED NAME field? There really is no "right" way to reverse-map an IP address to a particular NetBIOS service name. The solution to this problem involves some guessing, and it's not pretty. We will go into detail when we discuss the interface between SMB and the transport layer.

Of course, if SMB is running over raw transport then there is no NBT SESSION REQUEST message and, therefore, no CALLED NAME. In that case, the NetBIOS name isn't needed at all, which saves a lot of fuss and bother.

2.2.2 The Directory Path

A path! A path!
-- The Knights Who Say Ni, Monty Python And The Holy Grail, Monty Python's Flying Circus

  

The directory path looks just like a directory path, but there is one small thing that makes it different. That thing is called the "share name".

Whenever a resource is made available (shared) via SMB it is given a share name. The share name doesn't need to be the same as the actual name of the object being shared as it exists on the server. For example, consider the directory path below:

/dogs/corgi/stories/jolyon/

Suppose we just want to share the /stories subdirectory. If we simply call it "stories" no one will know what kind of stories it contains, so we should give it a more descriptive name. We might, for example, call it "dogbytes".

The share name takes the place of the actual directory name when the share is accessed via SMB. If the server is named "petserver", then the UNC path to the same directory would be:

\\petserver\dogbytes\jolyon\

As shown in figure 2.1, there can be more than one share name pointing to the same directory and access rules may be applied on a per-share basis. The idea is similar, in some ways, to that of symbolic links (symlinks) in Unix, or shortcuts in Windows. The share is a named pointer--with its own set of attributes--to the object being made available by the server.

[Figure 2.1]
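To make the server/share/path split concrete, here is a small sketch of a UNC parser. It is an illustration only: the name unc_split() and its calling convention are invented for this example, and it makes no attempt to validate the characters in each field.

```c
#include <stddef.h>
#include <string.h>

/* Copy n bytes of src into dst and terminate the string.
 * Fails if the field is empty or will not fit. */
static int copy_field(char *dst, size_t dstlen, const char *src, size_t n)
{
    if (n == 0 || n >= dstlen)
        return -1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return 0;
}

/* Split a UNC string of the form \\server\share\rest.
 * On success the server and share names are copied out, and *rest
 * points (within unc) at whatever follows the share name, which may
 * be the empty string.  Returns 0 on success, -1 on malformed input.
 * (Hypothetical helper, for illustration only.)
 */
int unc_split(const char *unc,
              char *server, size_t srvlen,
              char *share, size_t shrlen,
              const char **rest)
{
    const char *p, *q;

    if (strncmp(unc, "\\\\", 2) != 0)       /* must start with two backslashes */
        return -1;
    p = unc + 2;
    q = strchr(p, '\\');                    /* end of the server field */
    if (q == NULL || copy_field(server, srvlen, p, (size_t)(q - p)) != 0)
        return -1;

    p = q + 1;
    q = strchr(p, '\\');                    /* end of the share field */
    if (q == NULL)
        q = p + strlen(p);
    if (copy_field(share, shrlen, p, (size_t)(q - p)) != 0)
        return -1;

    *rest = (*q == '\\') ? q + 1 : q;
    return 0;
}
```

Given the string \\petserver\dogbytes\jolyon\ the function yields the server "petserver", the share "dogbytes", and a remaining path of "jolyon\". The server never sees "/dogs/corgi/stories"; mapping the share name back to that directory is the server's business.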

2.2.3 The File

This is the last stop on our quick UNC tour of SMB.

Files, like directories, should be fairly familiar and fairly straight-forward. As has been continually demonstrated, however, things in the CIFS world are not always as simple as they ought to be. Our point of interest on this part of the tour is the distinction between server filesystem syntax and semantics and client expectations...a very gnarled knot for CIFS implementors.

Consider, for example, a bunch of Windows clients connecting to an SMB server running on Linux. On the Linux system the filenames Corgi, corgi, and CORGI would all be distinct because Linux filesystems are typically case-sensitive. Windows, however, expects filenames to be case-insensitive, so all three names are the same from the Windows point of view. Thus, we have a conflict. How does a Linux server make all three files available to the Windows client?

Other difficult issues include:

  • filename lengths,
  • valid characters,
  • file access permissions, and
  • the end-of-line delimiter in text files.

These are complex problems, not easily solved. The CIFS protocol suite is not designed to be agnostic with regard to such things. In fact, CIFS goes out of its way at times to support features that are specific to DOS, OS/2, and Windows.

...and that concludes our tour. It's time to visit the gift shoppe.

2.2.4 The SMB URL

The UNC format is specific to one family of operating systems. Earlier on, though, we compared UNC with the more portable and modern URI format. That's called foreshadowing. It's a literary trick used to build suspense and anticipation.

There is, in fact, such a thing as an SMB URL. It fits into the general URI syntax7 and can be used to specify files, directories, and other SMB-shared stuff. It is intended as a more portable, and more complete way to specify SMB paths at the application level.

As of this writing, the SMB URL is only documented in an IETF Internet Draft, and is not yet any kind of standard. That hasn't stopped folks from implementing it, though. The SMB URL is supported in a wide variety of products including the KDE and GNOME desktop GUI environments, web browsers such as Galeon and Konqueror, and Open Source CIFS projects like jCIFS and libsmbclient (the latter is included with Samba). Thursby Software and Apple Computer also make use of the SMB URL in their commercial CIFS implementations.

That's good news for CIFS implementors because it means that there is an accepted, cross-platform way to identify SMB-shared resources, both within LANs and across the Internet.

2.2.5 Was That Trip Really Necessary?

Our quick UNC tour provided an introduction to some of the basic concepts, and annoyances, of SMB. We will expand upon those ideas as we dig more deeply into the protocol. The UNC format itself is also important for a variety of reasons, both historical and practical. Not least among these is that UNC strings are used within some of the SMB messages that cross the wire.

The SMB URL format is equally significant. It is portable, flexible, and gaining in popularity. It will also form the basis for examples given later in the text. If you are implementing an SMB client, you will most likely want to have some convention for identifying resources. You could invent your own, or use UNC, but the SMB URL is probably your best option.


2.3 First Contact: Reaching the Server

Getting there is half the fun.
-- Unknown
  

We are approaching this thing in layers. A little history, a quick introductory tour...and now this. It may seem like a bit of a diversion, but the goal in this section is to figure out how a client finds the server and initiates a connection. No, we're not dealing with SMB protocol yet, but we can't send SMB messages until we can talk to a server.

Think of a telephone call. If you want to call your cousin in New York the first thing you need to know is the telephone number. You could ask your uncle for the number or look it up in the telephone book, or perhaps you have it written on a scrap of paper somewhere in the kitchen with your favorite tofu recipes. If you dial the wrong number you will annoy some guy in a gas station in Brooklyn. When you dial the correct number, the underlying system will go through a complex process to set up the connection so that you can start talking to your cousin (or, more likely, to the answering machine).

...and if you want to connect with an SMB server you might need to resolve a NetBIOS or DNS name to an IP address. Once you have the address, you can attempt to open a session with the server.

Consider this simple SMB URL:

smb://server/

From the user's perspective, that should be enough to build an initial connection to an SMB server named "server".

From an implementation point of view, the first thing to do with this example is to parse out the "server" substring. In URI parlance, the field we are looking for is called the "host non-terminal"8, and it contains the name or address of the server to which we are trying to connect. Our term for the parsed-out string is "Server Identifier". Once we have extracted it, the next thing we need to know how to do is interpret it so that we can use the information to create the session.

2.3.1 Interpreting the Server Identifier

The SMB URL format supports the use of three different identifier types in the host field. We went over them briefly before. They are the IP address, DNS name, or NetBIOS name of the destination. Our next task is to figure out which is which.
 

If you want something done right
you have to do it yourself.
-- Well-known axiom
  

Presentation is everything, and it turns out that the code for interpreting the Server Identifier is verbose and tedious. Most of the busywork for handling NetBIOS names was covered in section 1, and there are plenty of tools for dealing with IP addresses and DNS names, so to save time we will describe how to interpret and resolve the address (and let you write the code yourself9).

It could be an IP address.

Check the syntax of the input to determine whether it is a valid representation of an IP address. Do this test first. It is quick, and does not involve sending any queries out over the network. The inet_aton() function, common on Unix-like operating systems, does the job nicely for the four-byte IPv4 addresses used today.

IP version 6 (IPv6) addresses are different. They are longer, harder for a human to read, and potentially more complicated to parse out. Fortunately, when used in URLs they are always contained within square brackets, as in the following example:

smb://[fe80::240:f4ff:fe1f:8243]/

The square brackets are reserved characters, used specifically for this purpose10. They make it easy to identify an IPv6 IP address. Once identified, the IPv6 address can be converted into its internal format by the inet_pton() function, which is now supported by many systems.
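The IP address test might be sketched along these lines. Two things to note: the function name classify_ip() and the enum are invented for this example, and the sketch uses inet_pton() for IPv4 as well as IPv6 instead of inet_aton(). That is a substitution, chosen because inet_pton() accepts only the familiar dotted-quad form; inet_aton() is more lenient about what it considers an address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>

enum id_kind { ID_NOT_IP = 0, ID_IPV4 = 4, ID_IPV6 = 6 };

/* Decide whether a Server Identifier is an IP address literal.
 * IPv6 addresses are only recognized inside square brackets, as they
 * appear in a URL.  No network traffic is generated.
 * (Hypothetical helper, for illustration only.)
 */
enum id_kind classify_ip(const char *host)
{
    unsigned char addr[sizeof(struct in6_addr)];   /* big enough for either */
    char buf[INET6_ADDRSTRLEN];
    size_t len = strlen(host);

    if (len >= 2 && host[0] == '[' && host[len - 1] == ']') {
        /* Strip the brackets before handing the string to inet_pton(). */
        if (len - 2 >= sizeof buf)
            return ID_NOT_IP;
        memcpy(buf, host + 1, len - 2);
        buf[len - 2] = '\0';
        return (inet_pton(AF_INET6, buf, addr) == 1) ? ID_IPV6 : ID_NOT_IP;
    }
    return (inet_pton(AF_INET, host, addr) == 1) ? ID_IPV4 : ID_NOT_IP;
}
```

If the result is ID_NOT_IP, move on to the NetBIOS and DNS tests.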

You are likely to be eaten
by a grue.
-- Zork, Marc Blank and
David Lebling, InfoCom

  
Note that it is, in theory, possible to register a NetBIOS name that looks exactly like an IP address. What's worse is that it might not be the same as the IP address of the node that registered it. That's nasty. Anyone who would do such a thing should have their keyboard taken away. It is probably not important to handle such situations. Defensive programming practices would suggest being prepared, but in this case the perpetrators deserve the troubles they cause for themselves.

It could be a NetBIOS Name.

If the Server Identifier isn't an IP address, it could be a NetBIOS name. To see if this is the case, the first step is to look for a dot ('.'). The SMB URL format does not allow un-escaped dots to appear in the NetBIOS name itself, so if there is a dot character in the raw string then consider the rest of the string to be a Scope ID. For example:

smb://my%2Enode.scope/

is made up of the NetBIOS name "MY.NODE" and the Scope ID "SCOPE". (The URL escape sequence for encoding a dot is %2E.)
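Here is one way the split might be coded. This is a sketch under a few assumptions: the function name split_nb_host() is made up, names are folded to upper case (as in the example above), the 15-byte limit on the name portion is enforced, and the Scope ID is copied without any unescaping.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int hexval(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = toupper(c);
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Split the host field of an SMB URL into a NetBIOS name and Scope ID.
 * Everything up to the first un-escaped dot is the name; the rest is
 * the scope.  %XX escapes in the name are decoded.
 * Returns 0 on success, -1 if the input cannot be a NetBIOS name.
 * (Hypothetical helper, for illustration only.)
 */
int split_nb_host(const char *host, char name[16], char *scope, size_t scopelen)
{
    const char *p = host;
    size_t n = 0, i;

    while (*p != '\0' && *p != '.') {
        int c = (unsigned char)*p;

        if (c == '%') {
            int hi = hexval((unsigned char)p[1]);
            int lo = (hi < 0) ? -1 : hexval((unsigned char)p[2]);
            if (lo < 0)
                return -1;              /* malformed escape sequence */
            c = hi * 16 + lo;
            p += 3;
        } else {
            p++;
        }
        if (c == 0 || n >= 15)
            return -1;                  /* embedded NUL, or name too long */
        name[n++] = (char)toupper(c);
    }
    name[n] = '\0';
    if (n == 0)
        return -1;

    if (*p == '.')
        p++;                            /* skip the dot before the scope */
    n = strlen(p);
    if (n >= scopelen)
        return -1;
    for (i = 0; i <= n; i++)            /* copies the terminator too */
        scope[i] = (char)toupper((unsigned char)p[i]);
    return 0;
}
```

Feeding it "my%2Enode.scope" produces the name "MY.NODE" and the scope "SCOPE". A plain "server" produces "SERVER" and an empty scope.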

Once the string has been parsed into its NetBIOS Name and Scope ID components, the next thing to do is to send an NBT Name Query. Always use a suffix value of 0x20, which is the prescribed suffix for SMB services. The handling of the query depends, of course, on whether the client is a B, M, P, or H node. For anything other than a B node, the IP address of the NBNS is required. Most client implementations keep such information in some form of configuration file or database.

If a positive response is received, keep track of the NetBIOS name and returned IP address. You will need them in order to connect to the server.

It could be a DNS name.

If the Server Identifier is neither an IP address nor a NetBIOS name, try DNS name resolution. The gethostbyname() function is commonly used to resolve DNS names to IP addresses, but be warned that this is a blocking function. It may take quite a while for it to do its job, and your program will do nothing in the mean time11. That is one reason that it is typically the last thing to try.

That is how to go about determining which kind of Server Identifier you've been given. Isn't overloading fun? Now you see why the code for handling all of this is tedious and verbose. It really is not very difficult, though, it's just that it takes a bit of work to get it all coded up.

2.3.2 The Destination Port

Port 139 is for NBT, and port 445 is for raw TCP--good rules of thumb. Recall, though, that the NBT Session Service provides a mechanism for redirection. In addition, some security protocols use high-numbered ports to tunnel SMB connections through firewalls. That means that the use of non-standard ports should be supported on the client side.

The SMB URL allows the specification of a destination port number, like so:

smb://server:1928/

Once again, that fits into standard URI syntax. If you spend any time using a web browser, the port field should be familiar.

What this all means, however, is that the port number does not always indicate which transport should be used. Rather the opposite; if the port number is not specified, the default port depends upon the transport. Knowing which transport to choose is, once again, something that requires some figuring out.
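A sketch of the host and port extraction may help. The function names are invented for this example, and the parser is deliberately simple-minded: it assumes the URL has no user or password portion, and it leaves the square brackets on an IPv6 address so that later code can recognize it. A port value of zero means "not specified", in which case the default depends on the transport.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define PORT_NBT 139    /* NBT Session Service   */
#define PORT_RAW 445    /* naked TCP transport   */

/* Pull the host and optional port out of "smb://host[:port]/...".
 * *port is set to 0 if no port was given.
 * Returns 0 on success, -1 if the URL is malformed.
 * (Hypothetical helper, for illustration only.)
 */
int smb_url_host(const char *url, char *host, size_t hostlen, int *port)
{
    const char *p, *end, *colon;
    size_t n;

    if (strncmp(url, "smb://", 6) != 0)
        return -1;
    p = url + 6;
    end = p + strcspn(p, "/");          /* host[:port] ends at '/' or NUL */

    if (*p == '[') {                    /* bracketed IPv6 address */
        const char *rb = memchr(p, ']', (size_t)(end - p));
        if (rb == NULL)
            return -1;
        n = (size_t)(rb + 1 - p);
        colon = (rb + 1 < end) ? rb + 1 : NULL;
        if (colon != NULL && *colon != ':')
            return -1;
    } else {
        colon = memchr(p, ':', (size_t)(end - p));
        n = (size_t)((colon != NULL ? colon : end) - p);
    }
    if (n == 0 || n >= hostlen)
        return -1;
    memcpy(host, p, n);
    host[n] = '\0';

    *port = 0;
    if (colon != NULL) {
        char *stop;
        long v;

        if (!isdigit((unsigned char)colon[1]))
            return -1;
        v = strtol(colon + 1, &stop, 10);
        if (stop != end || v < 1 || v > 65535)
            return -1;
        *port = (int)v;
    }
    return 0;
}

/* The default port follows from the transport, not the other way around. */
int default_port(int raw)
{
    return raw ? PORT_RAW : PORT_NBT;
}
```

With "smb://server:1928/" the result is the host "server" and port 1928. With "smb://server/" the port comes back as zero and the caller falls back on default_port() once it has picked a transport.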

2.3.3 Transport Discovery

As has been stated previously, we are only considering the NBT and naked TCP transports. Both of these are IP-based and the behavior of SMB over these two is nearly identical, so it does not seem as though separating them would be very important...but this is CIFS we're talking about.

The crux of the problem is whether or not the NBT SESSION REQUEST message is required. If the server is expecting correct NBT semantics, then we will need to find a valid NetBIOS name to place into the CALLED NAME field. This is a complicated process, involving a lot of trial-and-error. The recipe presented below is only one way to go about it. A good chef knows how to adjust the ingredients and choose seasonings to get the desired result. This is as much an art as it is a science.

2.3.3.1 Run Naked

Running naked is probably the easiest transport test to try first. The procedure is tasteful and dignified: simply assume that the server is expecting raw TCP transport. Open a TCP connection to port 445 on the server, but do not send an NBT SESSION REQUEST--just start sending SMB messages and see if that works. There are four possible results from this test:

  1. If nothing is listening on port 445 at the server, the TCP connection will fail. If that happens, the client can fall back to using NBT on port 139.
  2. If a non-SMB service is running on the destination port, one end or the other will (hopefully) figure out that the messages being exchanged are incomprehensible, and the connection will be dropped. Again, the fall-back is to try NBT on port 139.
  3. The remote end may be expecting NBT transport. This should never happen when talking to port 445, but defensive programming practices suggest being prepared. If the server requires NBT transport then it will probably reply to the initial SMB message by sending an NBT NEGATIVE SESSION RESPONSE.
  4. The connection might, after all, succeed.

All of the above applies if the user did not specify a non-standard port number. If the input looks more like this:

smb://server:2891/

...then the option of falling back to NBT on port 139 is excluded. In addition, there is no way to guess which transport type should be used if a port number other than 139 or 445 is specified. (In theory, it is also possible to run NBT transport on port 445 and naked transport on port 139. If you catch anyone doing such a twisted thing you should probably notify the authorities.)

Fortunately, Windows systems (Windows95, '98, and W2K were tested) return an NBT NEGATIVE SESSION RESPONSE if they get naked semantics on an NBT service port. This makes sense, because it lets the client know that NBT semantics are required. Samba's smbd goes one better and simply ignores the lack of a SESSION REQUEST message. Samba's behavior effectively merges the two transport types and makes the distinction between them irrelevant, which simplifies things on the server side and makes life easier for the client.
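On the client side, telling these outcomes apart comes down to looking at the first few bytes of whatever the server sends back. The sketch below is an illustration with invented names. It relies on two facts that are not spelled out in this section: the NBT Session Service packet type codes from RFC 1002 (0x82 for POSITIVE SESSION RESPONSE, 0x83 for NEGATIVE SESSION RESPONSE), and the 0xFF 'S' 'M' 'B' signature at the start of every SMB message.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum reply_kind {
    REPLY_SMB,              /* SESSION MESSAGE carrying an SMB      */
    REPLY_NBT_POSITIVE,     /* NBT POSITIVE SESSION RESPONSE (0x82) */
    REPLY_NBT_NEGATIVE,     /* NBT NEGATIVE SESSION RESPONSE (0x83) */
    REPLY_UNKNOWN           /* too short, or not something we know  */
};

/* Classify the first bytes received from the server.
 * (Hypothetical helper, for illustration only.)
 */
enum reply_kind classify_reply(const uint8_t *buf, size_t len)
{
    static const uint8_t smb_magic[4] = { 0xFF, 'S', 'M', 'B' };

    if (len < 4)
        return REPLY_UNKNOWN;           /* need the whole four-byte header */

    switch (buf[0]) {
    case 0x00:
        /* SESSION MESSAGE: the SMB signature should follow the header. */
        if (len >= 8 && memcmp(buf + 4, smb_magic, 4) == 0)
            return REPLY_SMB;
        return REPLY_UNKNOWN;
    case 0x82:
        return REPLY_NBT_POSITIVE;
    case 0x83:
        return REPLY_NBT_NEGATIVE;
    default:
        return REPLY_UNKNOWN;
    }
}
```

A client that ran naked and got REPLY_NBT_NEGATIVE back has learned that the server wants NBT semantics. REPLY_UNKNOWN suggests that something other than an SMB server is on the far end.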
 

Real Programmers
don't draw flowcharts.
--Unknown
  

The transport discovery process is illustrated using the anachronistic flowchart presented in figure 2.2.

[Figure 2.2]

2.3.3.2 Using the NetBIOS Name

If running naked didn't work, then you will probably need to try NBT transport. Also, back in section 2.3.1 we talked about the different types of Server Identifiers that most implementations support. One of those is the NetBIOS name, and it seems logical to assume that if the Server Identifier is a NetBIOS name then the transport will be NBT.

That's two good reasons to give NBT transport a whirl.

As stated earlier, the critical difference between the raw TCP and NBT transports is that NBT requires the SESSION REQUEST/POSITIVE SESSION RESPONSE exchange before the SMB messages can start flowing. The SESSION REQUEST, in turn, must contain a valid CALLED NAME. If the CALLED NAME is not correct, then some server implementations will reject the connection. (Windows seems to be quite picky, but Samba ignores the CALLED NAME field.)

Finding a valid CALLED NAME is easy if the Server Identifier is a NetBIOS name because, well... because there you are. The NetBIOS name is the correct CALLED NAME. Also, since the Server Identifier was resolved via an NBT Name Query, the server's IP address is known. That's everything you need.

There is one small problem with this scenario that could cause a little trouble: some NBNS servers can be configured to pass NetBIOS name queries through to the DNS system, which means that the DNS--not the NBNS--may have resolved the name to an IP address. That would mean that we have a false-positive and the Server Identifier is not, in fact, a NetBIOS name. If that happens, you could wind up trying to make an NBT connection to a system that isn't running NBT services. (The opposite of the "run naked" test described above.)

Detecting an SMB service that wants naked transport is not as clean and easy as detecting one that wants NBT. In testing, a Windows2000 system running naked TCP transport did not respond at all to an NBT SESSION REQUEST, and the client timed out waiting for the reply. This problem is neatly avoided if naked transport is attempted before NBT transport. Since Samba considers the SESSION REQUEST optional, this kind of transport confusion is not an issue when talking to a Samba server.

2.3.3.3 Reverse-Mapping a NetBIOS Name

Reverse-mapping is the last, desperate means for finding a workable NetBIOS CALLED NAME so that a valid SESSION REQUEST can be sent. Reverse-mapping is also quite common. Your code will need to try this technique if naked transport didn't work and the Server Identifier was a DNS name or IP address--a situation which is not unusual.

As stated before, there is no right way to do reverse-mapping. Fortunately, there are a few almost-right ways to go about it. Here they are:

Try a Node Status query.

Send an NBT NODE STATUS QUERY to the server. If it responds, run through the list of returned names looking for a unique name with a suffix byte value of 0x20. Try using that name as the CALLED NAME when setting up the session. If there are multiple names with a suffix value of 0x20, try them in series until you get a POSITIVE SESSION RESPONSE (or until they all fail).
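Picking candidates out of the reply could look something like the sketch below. It assumes the name list has already been located within the NODE STATUS RESPONSE and that it uses the RFC 1002 entry layout: a 16-byte name (the last byte being the suffix) followed by a two-byte flags field in which the high-order bit marks a group name. The function name is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define NB_ENTRY_LEN  18        /* 16-byte name + 2-byte NAME_FLAGS      */
#define NB_GROUP_FLAG 0x8000    /* set for group names, clear for unique */

/* Scan the name entries of a NODE STATUS RESPONSE, beginning at index
 * start, for the next unique name with a suffix byte of 0x20.
 * Returns the index of the entry, or -1 if there are no more.
 * (Hypothetical helper, for illustration only.)
 */
int next_server_name(const uint8_t *entries, int num, int start)
{
    int i;

    for (i = start; i < num; i++) {
        const uint8_t *e = entries + (size_t)i * NB_ENTRY_LEN;
        uint16_t flags = (uint16_t)((e[16] << 8) | e[17]);

        if (e[15] == 0x20 && (flags & NB_GROUP_FLAG) == 0)
            return i;
    }
    return -1;
}
```

To try the names in series, call the function with start set to zero, attempt a session with the name it finds, and if that fails call it again with start set to one past the previous result.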

Stop laughing. It gets better.

Try using the Generic CALLED NAME.

This kludge was introduced in Windows NT4 and has been adopted by many other implementations. It is fairly common, but not universal.

The generic CALLED NAME is *SMBSERVER<20> (that is, "*SMBSERVER" with a suffix byte value of 0x20). Think of it as an alias, allowing you to connect to the SMB server without knowing its "real", registered NetBIOS name. The *SMBSERVER<20> name starts with an asterisk, which is against the rules, so it is never registered with the NBT Name Service. If you send a unicast Name Query for this name, the destination node should always send a NEGATIVE NAME QUERY RESPONSE in reply (assuming that it is actually running NBT).
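Building the 16-byte form of a name such as *SMBSERVER<20> is simple enough to show. This sketch uses an invented function name and covers only the padding step (upper case, space-padded to 15 bytes, suffix in the sixteenth byte). The name still has to go through NBT first-level encoding before it can be placed in a SESSION REQUEST, and that step is not shown here.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/* Build the 16-byte NetBIOS name: up to 15 bytes of name, upper-cased
 * and padded with spaces, followed by the suffix byte.
 * Returns 0 on success, -1 if the name is empty or too long.
 * (Hypothetical helper, for illustration only.)
 */
int nb_pad_name(uint8_t out[16], const char *name, uint8_t suffix)
{
    size_t i, n = strlen(name);

    if (n == 0 || n > 15)
        return -1;
    for (i = 0; i < 15; i++)
        out[i] = (i < n) ? (uint8_t)toupper((unsigned char)name[i])
                         : (uint8_t)' ';
    out[15] = suffix;
    return 0;
}
```

Calling nb_pad_name(out, "*SMBSERVER", 0x20) gives the generic CALLED NAME. Since the suffix 0x20 is also the space character, the result looks like "*SMBSERVER" followed by six spaces.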

A bit awkward but it does work...sometimes. Now for the coup de grâce.

 

"Guess," said Marvin.
-- Restaurant at the End of the Universe, Douglas Adams

  
Try Using the DNS Name.

Try using the first label of the DNS name (the hostname of the server) as the CALLED NAME. If you were given an IP address you will need to do a reverse DNS lookup to get a name to play with (we suggested earlier that the DNS name might come in handy). As always, use a suffix byte value of 0x20.

If the first label doesn't work, try the first two labels (retaining the dot) and so on until you have a string that is longer than 15 bytes, at which point you give up.
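The label-by-label guessing game can be sketched as a function that produces the Nth candidate. The function name is invented, and folding the result to upper case is an assumption made here because NetBIOS names are conventionally upper case; the text above does not require it.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build a CALLED NAME candidate from the first nlabels labels of a
 * DNS name, keeping the dots between labels.
 * Returns 1 if a candidate was written to out, or 0 if the DNS name
 * has fewer than nlabels labels or the candidate exceeds 15 bytes.
 * (Hypothetical helper, for illustration only.)
 */
int dns_called_name(const char *dnsname, int nlabels, char out[16])
{
    const char *p = dnsname;
    size_t len = 0, i;
    int seen = 0;

    if (nlabels < 1)
        return 0;
    while (seen < nlabels) {
        size_t lab = strcspn(p, ".");       /* length of the next label */

        if (lab == 0)
            return 0;                       /* ran out of labels */
        len = (size_t)(p - dnsname) + lab;
        seen++;
        p += lab;
        if (*p == '.')
            p++;                            /* step over the dot */
    }
    if (len > 15)
        return 0;                           /* too long: time to give up */
    for (i = 0; i < len; i++)
        out[i] = (char)toupper((unsigned char)dnsname[i]);
    out[len] = '\0';
    return 1;
}
```

A caller would loop with nlabels set to 1, 2, 3, and so on, trying each candidate (with suffix 0x20) until one is accepted or the function returns zero. For "petserver.example.com" the only candidate is "PETSERVER", because "petserver.example" is already 17 bytes long.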

Yes, there are implementations which actually do this.

If none of those options worked, then it is finally time to send an error message back to the user explaining that the Server Identifier is no good.

Ignorance is Bliss Omission Alert:

We have not fully discussed IPv6.

As it currently stands, NBT doesn't work with IPv6. All of the IP address fields in the NBT messages are four-byte fields, but IPv6 addresses are longer. There has been talk of NetBIOS emulation over IPv6, but if such a thing ever happens (unlikely) it will take a while before the proposal is worked out and accepted.

Unfortunately, when it comes to SMB over IPv6 the author is clueless. It is probably just like SMB over naked transport, except that the addresses are IPv6 addresses.
 

2.3.4 Connecting to the Server

We are still dealing with the transport layer and haven't actually seen any SMBs yet. It is, however, finally time for some code. Listing 2.1 handles the basics of opening the connection with an SMB server. It is example code so, of course, it takes a few shortcuts. For instance, it completely side-steps Server Identifier interpretation and transport discovery (that is, everything we just covered).

[Listing 2.1]

The code in listing 2.1 provides an outline for setting up the session via NBT or raw TCP. With that step behind us, we won't have to deal with the details of the transport layer any longer. Let's run through some code highlights quickly and put all that transport stuff behind us.

Transport:
The program does not attempt to discover which transport to use. As written, it assumes NBT transport. To try naked transport, simply comment out the call to RequestNBTSession() in main().

The Command Line:
Because we are shamelessly avoiding presenting code that interprets Server Identifiers, the example program makes the user do all of the work. The user must enter the NetBIOS name and IP address of the server. Entering a destination port number is optional.

The name entered on the command line will be used as the CALLED NAME. If the input string begins with an asterisk, the generic *SMBSERVER<20> name will be used instead.

The CALLING NAME (NBT source address):
The program inserts SMBCLIENT<00> as the CALLING NAME.

In a correct implementation, the name should be the client's NetBIOS Machine Name (which is typically the same as the client's DNS hostname) with a suffix byte value of 0x00.

The contents of the CALLING NAME field are not particularly significant. According to the expired Leach/Naik CIFS Internet Draft, the same name from the same IP address is supposed to represent the same client...but you knew that. Samba can make use of the CALLING NAME via a macro in the smb.conf configuration file. The macro is used for all sorts of things, including generating per-client log files.


We leave this as an
exercise for the reader.
-- Unknown
  
Transporting SMBs:
A key feature of this program is the line within main() which reads:

/* ** Do real work here. ** */

That's where the SMB stuff is supposed to happen. At that point in the code, the session has been established on top of the transport layer and it is time to start moving those Server Message Blocks.

Use the program above as a starting point for building your own SMB client utility. Add a parser capable of dissecting the UNC or SMB URL format, and then code up Server Identifier resolution and transport discovery, as described above. When you have all of that put together, you will have completed the foundation of your SMB client.
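
To make the CALLED NAME and CALLING NAME handling a little more concrete, here is a rough sketch of how an NBT SESSION REQUEST packet could be assembled. It is not the code from Listing 2.1. The function names are invented for this example, and it assumes the usual NBT conventions: names are space-padded to 15 bytes and upper-cased, the suffix byte follows, each name is first-level encoded as described in RFC 1002, and the scope is empty.

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

#define NBT_SESSION_REQUEST 0x81   /* NBT session packet type (RFC 1002) */

/* Pad a name to 15 bytes with spaces, upper-case it, append the suffix. */
static int nb_make_name(const char *name, uint8_t suffix, uint8_t nb[16])
{
    size_t i, n = strlen(name);

    if (n == 0 || n > 15)
        return -1;
    memset(nb, ' ', 15);
    for (i = 0; i < n; i++)
        nb[i] = (uint8_t)toupper((unsigned char)name[i]);
    nb[15] = suffix;
    return 0;
}

/* First-level encoding: each byte becomes two letters in 'A'..'P'. */
static void nb_encode(const uint8_t nb[16], uint8_t out[34])
{
    int i;

    out[0] = 0x20;                       /* label length: 32 bytes   */
    for (i = 0; i < 16; i++) {
        out[1 + 2 * i] = (uint8_t)('A' + (nb[i] >> 4));
        out[2 + 2 * i] = (uint8_t)('A' + (nb[i] & 0x0F));
    }
    out[33] = 0x00;                      /* empty scope: root label  */
}

/* Fill pkt with a SESSION REQUEST. Returns 72, or -1 on a bad name. */
int nbt_session_request(const char *called, const char *calling,
                        uint8_t pkt[72])
{
    uint8_t nb[16];

    if (called[0] == '*')                /* use the generic name     */
        called = "*SMBSERVER";

    pkt[0] = NBT_SESSION_REQUEST;        /* type                     */
    pkt[1] = 0;                          /* flags                    */
    pkt[2] = 0;                          /* length, network order:   */
    pkt[3] = 68;                         /* two 34-byte names        */

    if (nb_make_name(called, 0x20, nb) != 0)
        return -1;
    nb_encode(nb, pkt + 4);              /* CALLED NAME<20>          */

    if (nb_make_name(calling, 0x00, nb) != 0)
        return -1;
    nb_encode(nb, pkt + 38);             /* CALLING NAME<00>         */

    return 72;
}
```

Calling nbt_session_request("*", "smbclient", pkt) produces a request addressed to *SMBSERVER<20> from SMBCLIENT<00>, which matches what the example program is described as sending.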


2.4 SMB in its Natural Habitat

Are we there yet?
-- Kids in the back seat.
  

We have spent a lot of time and effort preparing for this expedition, and we are finally ready to venture into SMB territory. It can be a treacherous journey, though, so before we push ahead we should re-check our equipment.

Test Server

If you are going to start testing, you have to have something at which to fling packets. When choosing a test server, keep in mind that SMB has grown and changed and evolved and adapted and mutated over the years. You want a server that can be configured to meet your testing needs. Samba, of course, is highly configurable. If you know your way around the Windows Registry, you may have luck with those systems as well. In particular, you probably want to avoid strong password encryption during the initial stages. Handling authentication is a big chunk of work, and it is best to try and reduce the number of simultaneous problems to a manageable few.

Repetitive Terminology Redundancy Notification Alert Alert:

The SMB server software running on a file server node is known as the "File Server Service", or just "Server Service".

When running on top of NBT, the Server Service always registers a NetBIOS name composed of the Machine Name and, of course, a suffix value of 0x20. The Machine Name is typically--but not necessarily--the same as the DNS host name.
 

Test Client

The next thing you will want is a packet flinger. That is, a working client. You need this for testing, and to compare behavior when debugging your own client. Samba offers the smbclient utility, and jCIFS comes with a variety of example programs. Windows systems all have SMB support built-in. That's quite a selection from which to choose.

Sniffer

Always your best friend. A good packet analyzer--one with a lot of built-in knowledge of SMB--will be your trusted guide through the SMB jungle.

Documentation

When exploring NBT we relied upon RFC 1001 and RFC 1002 as if they were ancient maps, drawn on cracked and drying parchment, handed down to us by those who had gone before. In the wilds of SMB territory, we will count on the SNIA CIFS Technical Reference as our primary resource. The old X/Open SMB specification and the SMB/CIFS documentation available from Microsoft's FTP server will also come in handy. For the sake of efficiency, from here on out we will be a bit less formal and refer to the SNIA doc as "the SNIA doc", and the X/Open doc as "the X/Open doc".

Yet Another Tasty Terminology Treat Alert:

As we have explained, "SMB" is the Server Message Block protocol. It is also true that "an SMB" is a message. In order to implement SMB, one must learn to send and receive SMBs.

Got that?
 

Keep in mind that the goal of our first trip into the wilds of SMB-land is to become familiar with the terrain and to study SMBs in their natural habitat, so we can learn about their anatomy and behavior. We are not ready yet for a detailed study of SMB innards. That will come later.

2.4.1 Our Very First Live SMBs

We need to capture a few SMBs to see what they look like up close. That means it's time to take a look at the wire and see what's there to be seen. Fire up your protocol analyzer, and then your SMB client. If you can configure your test server to allow anonymous connections (no username, no password) it will simplify things at this stage. If you can't, then things won't run quite as they are shown below. Don't worry, it will be close enough.

For this example, we will use the Exists.java program that comes with jCIFS. It is a very simple utility that does nothing more than verify the existence of the object specified by the given SMB URL string, like so:

 shell

  $ java Exists smb://smedley/home
  smb://smedley/home exists
  $
  

The above shows that we were able to access the HOME share on node SMEDLEY. A similar test can be performed using Samba's smbclient, or with the NET USE command under Windows [12]:

 DOS Prompt

  C:\> net use \\smedley\home
  The command was completed successfully.

  C:\> net use /d \\smedley\home
  The command was completed successfully.

  C:\>
  

Those simple commands will generate the packets we want to capture and study. Stop your sniffer and take a look at the trace. You should see a chain of events similar to the following:


  No. Source    Destination       Protocol Info
  --- --------  ----------------  -------- ------------------------------
    1 Marika    255.255.255.255   NBNS     Name query
    2 Smedley   Marika            NBNS     Name query response
    3 Marika    Smedley           TCP      34102 > netbios-ssn [SYN]
    4 Smedley   Marika            TCP      netbios-ssn > 34102 [SYN, ACK]
    5 Marika    Smedley           TCP      34102 > netbios-ssn [ACK]
    6 Marika    Smedley           NBSS     Session request
    7 Smedley   Marika            NBSS     Positive session response
    8 Marika    Smedley           TCP      34102 > netbios-ssn [ACK]
    9 Marika    Smedley           SMB      Negotiate Protocol Request
   10 Smedley   Marika            SMB      Negotiate Protocol Response
   11 Marika    Smedley           SMB      Session Setup AndX Request
   12 Smedley   Marika            SMB      Session Setup AndX Response
   13 Marika    Smedley           TCP      34102 > netbios-ssn [FIN, ACK]
   14 Smedley   Marika            TCP      netbios-ssn > 34102 [FIN, ACK]  
   15 Marika    Smedley           TCP      34102 > netbios-ssn [ACK]

The above is edited output from an Ethereal capture [13]. The packets were generated using the jCIFS Exists utility, as described above. In this case jCIFS was talking to an old Windows95 system, but any SMB server should produce the same or similar results.

The trace is reasonably simple. The first thing that node MARIKA does is send a broadcast NBT Name query to find node SMEDLEY, and SMEDLEY responds. Packets 3, 4, & 5 show the TCP session being created. (Note that netbios-ssn is the descriptive name given to port 139.) Packets 6 and 7 are the NBT SESSION REQUEST/SESSION RESPONSE exchange, and packet #8 is an ACK message, which is just TCP taking care of its business.

Packets 9 and 10 are what we want. These are our first SMBs.

2.4.2 SMB Message Structure

I never metaphor I couldn't mix.
-- Me
  

Figure 2.3 provides an overview of SMB gross anatomy. It shows that SMBs are composed of three basic parts:

  • the Header,
  • the Parameter Block, and
  • the Data Block.

Either or both of the latter two segments may be vestigial (size == 0) in some specimens.

[Figure 2.3]

2.4.2.1 SMB Message Header

Starting at the top, the SMB header is arranged like so:

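     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------+---------------+---------------+---------------+
    |     0xff      |      'S'      |      'M'      |      'B'      |
    +---------------+---------------+---------------+---------------+
    |    COMMAND    |                   STATUS...                   |
    +---------------+---------------+-------------------------------+
    |   ...STATUS   |     FLAGS     |            FLAGS2             |
    +---------------+---------------+-------------------------------+
    |                             EXTRA                             |
    |                              ...                              |
    |                              ...                              |
    +-------------------------------+-------------------------------+
    |              TID              |              PID              |
    +-------------------------------+-------------------------------+
    |              UID              |              MID              |
    +-------------------------------+-------------------------------+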

We can also dissect the header using the simple syntax presented previously:

    SMB_HEADER
      {
      PROTOCOL  = "\xffSMB"
      COMMAND   = <SMB Command code (one byte)>
      STATUS    = <Status code>
      FLAGS     = <Old flags>
      FLAGS2    = <New flags>
      EXTRA     = <Sometimes used for additional data>
      TID       = <Tree ID>
      PID       = <Process ID>
      UID       = <User ID>
      MID       = <Multiplex ID>
      }

We now have a pair of perspectives on the header structure. Time for some good, old-fashioned descriptive text.

The PROTOCOL and COMMAND Fields:
The SMB header starts off easily enough. The first four bytes are the protocol identifier string, which always has the same value: "\xffSMB". It's not particularly clear [14] why this is included in the SMBs but there it is, and it's in all of them.

The next byte is the COMMAND field, which tells us what kind of SMB we are looking at. In the NEGOTIATE PROTOCOL messages captured above, the COMMAND field has a value of 0x72 (aka. SMB_COM_NEGOTIATE). The SNIA doc has a list of the available command codes. That list is probably complete, but this is SMB we are talking about so you never know...

The STATUS Field:
Now things start to get surreally interesting.

DOS and OS/2 use 16-bit error codes, grouped into classes. To accommodate these codes, the STATUS field is subdivided like so:

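     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------+---------------+-------------------------------+
    |  ErrorClass   |  <reserved>   |           ErrorCode           |
    +---------------+---------------+-------------------------------+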

WindowsNT introduced a new set of 32-bit error codes, known as NT_STATUS codes. These use the entire status field to hold the NT_Status value:

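     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------------------------------------------------------+
    |                           NT_Status                           |
    +---------------------------------------------------------------+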

Be afraid. Be very afraid.
-- Veronica Quaife
(Geena Davis)
The Fly (1986)
  
With two error code formats from which to choose, the client and server must confer to decide which set will be used. How that is done will be explained later on. Error code handling is a large-sized topic with extra sauce.

FLAGS and FLAGS2:
Look around the Web for a copy of a document called COREP.TXT [15]. This is probably the earliest SMB documentation that is also easy to find. In COREP.TXT, you can see that the original SMB header layout reserved fifteen bytes following the error code field. Those 15 bytes have, over time, been carved up for a variety of uses.

The first formerly-reserved byte is now known as the FLAGS field. The bits of the FLAGS field are used to modify the interpretation of the SMB. For example, the highest-order bit is used to indicate whether the SMB is a request (0) or a response (1).

Following the FLAGS field is the two-byte FLAGS2 field. This set of bits is used to indicate the use of newer features, such as the 32-bit NT_STATUS error codes.
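
Here is a small sketch of how those bits might be tested. The macro and function names are made up for this example. The 0x80 response bit is the highest-order FLAGS bit described above. The 0x0001 (long filenames) and 0x8000 (Unicode) values match the FLAGS2 values seen in the captures in this chapter. The use of 0x4000 as the NT_STATUS bit is our reading of the CIFS documentation, so check it against the SNIA doc before relying on it.

```c
#include <stdint.h>

#define SMB_FLAGS_RESPONSE     0x80    /* 0 = request, 1 = response    */

#define SMB_FLAGS2_LONG_NAMES  0x0001  /* long filename support        */
#define SMB_FLAGS2_NT_STATUS   0x4000  /* STATUS holds an NT_STATUS    */
#define SMB_FLAGS2_UNICODE     0x8000  /* strings are Unicode          */

/* Non-zero if the SMB is a response rather than a request. */
int smb_is_response(uint8_t flags)
{
    return (flags & SMB_FLAGS_RESPONSE) != 0;
}

/* Non-zero if the STATUS field should be read as a 32-bit NT_STATUS. */
int smb_uses_nt_status(uint16_t flags2)
{
    return (flags2 & SMB_FLAGS2_NT_STATUS) != 0;
}
```

With FLAGS = 0x18 and FLAGS2 = 0x8001, for instance, smb_is_response() and smb_uses_nt_status() both return zero: the message is a request, and its STATUS field is in DOS format.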

The EXTRA Field:
The EXTRA field takes up most of the remaining formerly-reserved bytes. It contains two subfields, as shown below:

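     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +-------------------------------+-------------------------------+
    |            PidHigh            |         Signature...          |
    +-------------------------------+-------------------------------+
    |                        ...Signature...                        |
    +-------------------------------+-------------------------------+
    |         ...Signature          |           <unused>            |
    +-------------------------------+-------------------------------+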

The PidHigh subfield is used to accommodate systems that have 32-bit Process IDs. The original SMB header format only had room for 16-bit PIDs (in the PID field, described further on).

The 8-byte Signature subfield is for SMB message signing, which uses cryptography to protect against a variety of attacks that might be tried by badguys hoping to gain unauthorized access to SMB shares.

When not in use, these fields must be filled with zeros.

TID, PID, UID, and MID:
TID The "Tree ID".
In SMB, a share name typically represents a directory or subdirectory tree on the server. The SMB used to open a share is called a "Tree Connect" because it allows the client to connect to the shared [sub]directory tree. That's where the name comes from. The TID field is used to identify connections to shares once they have been established.
 
PID The "Process ID".
This value is set by the client, and is intended as an identifier for the process sending the SMB request. The most important thing to note regarding the PID is that file locking and access modes are maintained relative to the value in this field.

The PID is 16 bits wide, but it can be extended to 32 bits using the EXTRA.PidHigh field described earlier.
 

UID The "User ID"
This is also known as a VUID (Virtual User ID). It is assigned by the server after the user has authenticated and is valid until the user logs off. It does not need to be the user's actual User ID on the server system. Think of it as a session token assigned to a successful logon.
 
MID The "Multiplex ID".
This is used by the client to keep track of multiple outstanding requests. The server must echo back the MID and the PID provided in the client request. The client can use those values to make sure that the reply is matched up to the correct request.

The TID and [V]UID are assigned and managed by the server, while the PID and MID are assigned by the client. It is important to note that the values in these fields do not necessarily have any meaning outside of the SMB connection. The PID, for example, does not need to be the actual ID of the client process. The client and server assign values to these fields in order to keep track of context, and that's all.
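
Putting the header description together, a parser for the 32-byte header might look something like the sketch below. The struct, the function names, and the helper names are all invented for this example. The byte offsets follow directly from the field sizes given above, and multi-byte fields are read low byte first because that is the byte order SMB uses on the wire.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SMB_HDR_SIZE 32

typedef struct {
    uint8_t  command;
    uint32_t status;        /* raw 32 bits; DOS or NT_STATUS format */
    uint8_t  flags;
    uint16_t flags2;
    uint16_t pid_high;      /* EXTRA.PidHigh   */
    uint8_t  signature[8];  /* EXTRA.Signature */
    uint16_t tid, pid, uid, mid;
} smb_header;

static uint16_t le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if buf is too short or lacks "\xffSMB". */
int smb_parse_header(const uint8_t *buf, size_t len, smb_header *h)
{
    if (len < SMB_HDR_SIZE || memcmp(buf, "\xffSMB", 4) != 0)
        return -1;

    h->command  = buf[4];
    h->status   = le32(buf + 5);
    h->flags    = buf[9];
    h->flags2   = le16(buf + 10);
    h->pid_high = le16(buf + 12);
    memcpy(h->signature, buf + 14, 8);
    /* buf[22] and buf[23] are the unused bytes of EXTRA */
    h->tid      = le16(buf + 24);
    h->pid      = le16(buf + 26);
    h->uid      = le16(buf + 28);
    h->mid      = le16(buf + 30);
    return 0;
}

/* Combine PidHigh and PID into a 32-bit process ID. */
uint32_t smb_full_pid(const smb_header *h)
{
    return ((uint32_t)h->pid_high << 16) | h->pid;
}
```

The parser deliberately leaves STATUS as a raw 32-bit value, since how to interpret it depends on which error code format is in use.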

2.4.2.2 SMB Message Parameters

In the middle of the SMB message are two fields labeled WordCount and Words[]. For our purposes, we will identify these two fields as being the SMB_PARAMETERS block, which looks like this:

     0                   1                   2
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 . . .
    +---------------+------------------------- - - -
    |   WordCount   |   Words...
    +---------------+------------------------- - - -

    SMB_PARAMETERS
      {
      WordCount         = <Number of words in the Words array>
      Words[WordCount]  = <SMB parameters; varies with SMB command>
      }

The Words field is simply a block of data that is 2 × WordCount bytes in length. Perhaps at one time the intention was that it would contain only two-byte values (a quick look at COREP.TXT suggests that this is the case). In practice, all sorts of stuff is thrown in there.

Each SMB message type (species?) has a different record structure that is carried in the Words block. Think of that structure as representing the parameters passed to a function (the function identified by the SMB command code listed in the header).

2.4.2.3 SMB Message Data

Following the SMB_PARAMETERS is another block of data, the content of which also varies in structure on a per-SMB basis:

     0                   1                   2
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 . . .
    +-------------------------------+--------- - - -
    |           ByteCount           |   Bytes...
    +-------------------------------+--------- - - -

    SMB_DATA
      {
      ByteCount        = <Number of bytes in the Bytes field>
      Bytes[ByteCount] = <Contents varies with SMB command>
      }

The Bytes field holds the data to be manipulated. For example, it may contain the data retrieved in response to a READ operation, or the data to be written by a WRITE operation. In many cases, though, the SMB_DATA block is just another record structure with several subfields. Through time, SMB has evolved lazily and any functional distinction that may have separated the Parameter and Data blocks has been blurred.

Note that the SMB_DATA.ByteCount field is an unsigned short, while the SMB_PARAMETERS.WordCount field is an unsigned byte. That means that the SMB_PARAMETERS.Words block is limited in length to 510 bytes (2 × 255), while SMB_DATA.Bytes may be as much as 65535 bytes in length. If you add all that up, and then add in the SMB_PARAMETERS.WordCount field, the SMB_DATA.ByteCount field, and the size of the header, you will find that the whole thing (66080 bytes at most) fits easily into the 2^17-1 (131071) bytes made available in the NBT SESSION MESSAGE header.
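
As a sketch of how the two blocks could be located in a received message, consider the following. The struct and function names are invented here, and the code handles only the first Parameter/Data pair following the header. The SMB_MAX_SIZE macro simply restates the arithmetic from the preceding paragraph.

```c
#include <stddef.h>
#include <stdint.h>

#define SMB_HDR_SIZE 32

/* Header + WordCount + Words + ByteCount + Bytes, each at its maximum. */
#define SMB_MAX_SIZE (SMB_HDR_SIZE + 1 + 2 * 255 + 2 + 65535)

typedef struct {
    uint8_t        word_count;   /* SMB_PARAMETERS.WordCount */
    const uint8_t *words;        /* 2 * word_count bytes     */
    uint16_t       byte_count;   /* SMB_DATA.ByteCount       */
    const uint8_t *bytes;        /* byte_count bytes         */
} smb_body;

/* p points just past the SMB header; len is the number of bytes left.
 * Returns 0 on success, or -1 if the message is truncated.
 */
int smb_parse_body(const uint8_t *p, size_t len, smb_body *b)
{
    size_t words_len;

    if (len < 1)
        return -1;
    b->word_count = p[0];
    words_len = 2 * (size_t)b->word_count;

    if (len < 1 + words_len + 2)
        return -1;
    b->words = p + 1;

    /* ByteCount is a little-endian unsigned short. */
    b->byte_count = (uint16_t)(p[1 + words_len] |
                               (p[2 + words_len] << 8));
    if (len < 1 + words_len + 2 + b->byte_count)
        return -1;
    b->bytes = p + 1 + words_len + 2;
    return 0;
}
```

The bounds checks matter. WordCount and ByteCount come straight off the wire, so a careless parser that trusts them can be walked right off the end of its buffer.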

2.4.3 Case in Point: NEGOTIATE PROTOCOL

Now that we have an overview of the structure of SMB messages, we can take a closer look at our live specimen. Remember packets 9 and 10 from the capture we made earlier? They show a NEGOTIATE PROTOCOL exchange. Let's get out the tweezers, the pocket knife, & dad's hammer and see what's inside.

    NEGOTIATE_PROTOCOL_REQUEST
      {
      SMB_HEADER
        {
        PROTOCOL  = "\xffSMB"
        COMMAND   = SMB_COM_NEGOTIATE (0x72)
        STATUS
          {
          ErrorClass = 0x00   (Success)
          ErrorCode  = 0x0000 (No Error)
          }
        FLAGS     = 0x18 (Pathnames are case-insensitive)
        FLAGS2    = 0x8001 (Unicode and long filename support)
        EXTRA
          {
          PidHigh    = 0x0000
          Signature  = 0 (all bytes zero filled)
          }
        TID       = 0 (Not yet known)
        PID       = <Client Process ID>
        UID       = 0 (Not yet known)
        MID       = 2 (often 0 or 1, but varies per OS)
        }
      SMB_PARAMETERS
        {
        WordCount = 0
        Words     = <empty>
        }
      SMB_DATA
        {
        ByteCount = 12
        Bytes
          {
          BufferFormat = 0x02 (Dialect)
          Name         = "NT LM 0.12" (nul terminated)
          }
        }
      }

The breakdown of packet 9 shows the SMB NEGOTIATE PROTOCOL REQUEST as sent by the jCIFS Exists utility. Other clients will use slightly different values, but they are all variations on the same theme. Some features worth noting:

  • The COMMAND field has a value of 0x72 (SMB_COM_NEGOTIATE). That's how we know that this is a NEGOTIATE PROTOCOL message. We also know that it is a REQUEST rather than a RESPONSE because the highest-order bit in the FLAGS field has a value of zero (0).

  • The STATUS field is all zeros at this point because we haven't yet done anything to cause an error. Also, the error messages are presented in the older DOS format. This is because jCIFS is indicating, via a bit in the FLAGS2 field, that it is using the DOS format. We'll dig into those bits later on.

  • Several fields (the EXTRA.Signature, the TID, and the UID, to name a few) contain zeros. The content of these fields has not yet been determined, and they may or may not be filled in later on. It all depends upon the types of SMB requests that are issued. Stay tuned.

  • In this particular SMB the Parameter block is empty and all of the useful information is being carried in the Data block. In contrast, the response packet from the server (packet 10) makes use of both the Parameter and Data blocks (assuming that there are no errors). See for yourself by looking at the NEGOTIATE PROTOCOL RESPONSE in your capture.

    The Data block in the request contains the list of protocols that the client is able to speak. jCIFS only knows one dialect, so only one name is listed in the message above. As you can see, jCIFS implements the "NT LM 0.12" dialect (the most recent and widely supported as of this writing). Other clients, such as Samba's smbclient, support a longer list of dialects.
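
Going the other way, the request dissected above could be assembled along these lines. This is a sketch, not the jCIFS implementation. The function name is invented, the FLAGS and FLAGS2 values are simply copied from the capture, and the four-byte NBT SESSION MESSAGE header that precedes the SMB on the wire is not included.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SMB_HDR_SIZE       32
#define SMB_COM_NEGOTIATE  0x72

/* Write a NEGOTIATE PROTOCOL REQUEST offering one dialect into buf.
 * Returns the length of the SMB in bytes, or 0 if it does not fit.
 */
size_t smb_build_negotiate(uint8_t *buf, size_t buflen,
                           uint16_t pid, uint16_t mid,
                           const char *dialect)
{
    size_t name_len   = strlen(dialect) + 1;   /* include the nul     */
    size_t byte_count = 1 + name_len;          /* BufferFormat + name */
    size_t total      = SMB_HDR_SIZE + 1 + 2 + byte_count;

    if (byte_count > 0xFFFF || total > buflen)
        return 0;

    memset(buf, 0, total);         /* STATUS, EXTRA, TID, UID stay 0  */
    memcpy(buf, "\xffSMB", 4);     /* PROTOCOL                        */
    buf[4]  = SMB_COM_NEGOTIATE;   /* COMMAND                         */
    buf[9]  = 0x18;                /* FLAGS, as in the capture        */
    buf[10] = 0x01;                /* FLAGS2 = 0x8001, low byte first */
    buf[11] = 0x80;
    buf[26] = (uint8_t)(pid & 0xFF);
    buf[27] = (uint8_t)(pid >> 8);
    buf[30] = (uint8_t)(mid & 0xFF);
    buf[31] = (uint8_t)(mid >> 8);

    buf[32] = 0;                   /* WordCount: no parameters        */
    buf[33] = (uint8_t)(byte_count & 0xFF);
    buf[34] = (uint8_t)(byte_count >> 8);
    buf[35] = 0x02;                /* BufferFormat: Dialect           */
    memcpy(buf + 36, dialect, name_len);
    return total;
}
```

For the "NT LM 0.12" dialect the result is 47 bytes long with a ByteCount of 12, which agrees with the dissection. A client offering several dialects would repeat the BufferFormat/Name pair once per dialect.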

2.4.4 The AndX Mutation

In the trace given above, Ethereal has identified packets 11 and 12 as being a SESSION SETUP ANDX exchange [16]. The term "ANDX" at the end of the names indicates that these messages belong to a curious class of creatures known as "AndX messages". SMB AndX messages are actually several SMBs combined into a single symbiotic packet as shown in figure 2.4. It is an efficient mutation.
 

<tpot> shouldn't that be an AntX?
-- Tim Potter on IRC
  

[Figure 2.4]

AndX messages work something like a linked list. Each Parameter block in an AndX message begins with the following structure:

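     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------+---------------+-------------------------------+
    |  AndXCommand  |  <reserved>   |          AndXOffset           |
    +---------------+---------------+-------------------------------+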

The AndXCommand field provides the SMB command code for the next AndX block in the list (not the current one). The AndXOffset contains the byte index, relative to the start of the SMB header, of that next AndX block--think of it as a pointer. Since the AndXOffset value is independent of the SMB_PARAMETERS.WordCount and SMB_DATA.ByteCount values, it is possible to provide padding between the AndX blocks as shown in figure 2.5.

[Figure 2.5]

Now that we have a general idea of what an SMB AndX message looks like we are ready to dissect packet 11. It looks like this:

    SESSION_SETUP_ANDX_REQUEST
      {
      SMB_HEADER
        {
        PROTOCOL  = "\xffSMB"
        COMMAND   = SMB_COM_SESSION_SETUP_ANDX (0x73)
        STATUS
          {
          ErrorClass = 0x00   (Success)
          ErrorCode  = 0x0000 (No Error)
          }
        FLAGS     = 0x18 (Pathnames are case-insensitive)
        FLAGS2    = 0x0001 (Long filename support)
        EXTRA
          {
          PidHigh    = 0x0000
          Signature  = 0 (all bytes zero filled)
          }
        TID       = 0 (Not yet known)
        PID       = <Client Process ID>
        UID       = 0 (Not yet known)
        MID       = 2 (often 0 or 1, but varies per OS)
        }
      ANDX_BLOCK[0] (Session Setup AndX Request)
        {
        SMB_PARAMETERS
          {
          WordCount     = 13
          AndXCommand   = SMB_COM_TREE_CONNECT_ANDX (0x75)
          AndXOffset    = 79
          MaxBufferSize = 1300
          MaxMpxCount   = 2
          VcNumber      = 1
          SessionKey    = 0
          CaseInsensitivePasswordLength = 0
          CaseSensitivePasswordLength   = 0
          Capabilities  = 0x00000014
          }
        SMB_DATA
          {
          ByteCount     = 20
          AccountName   = "GUEST"
          PrimaryDomain = "?"
          NativeOS      = "Linux"
          NativeLanMan  = "jCIFS"
          }
        }
      ANDX_BLOCK[1] (Tree Connect AndX Request)
        {
        SMB_PARAMETERS
          {
          WordCount       = 4
          AndXCommand     = SMB_COM_NONE (0xFF)
          AndXOffset      = 0
          Flags           = 0x0000
          PasswordLength  = 1
          }
        SMB_DATA
          {
          ByteCount       = 22
          Password        = ""
          Path            = "\\SMEDLEY\HOME"
          Service         = "?????"  (yes, really)
          }
        }
      }

There is a lot of information in that message, but we are not yet ready to dig into the details. There is just too much to cover all of it at once. Our goals right now are simply to highlight the workings of the AndX blocks, and to provide a glimpse inside the SESSION SETUP ANDX & TREE CONNECT ANDX sub-messages so that we will have something to talk about later on.

The block labeled ANDX_BLOCK[0] is the body of the SESSION SETUP REQUEST, and ANDX_BLOCK[1] contains the TREE CONNECT REQUEST. Note that the AndXCommand field in the final AndX block is given a value of 0xFF. This, in addition to the zero offset in the AndXOffset field, indicates the end of the AndX list.
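
Following the chain from one AndX block to the next might be sketched as below. The names are invented for this example. The function assumes that the block it is given really is an AndX block, with WordCount in its first byte followed by the AndXCommand, reserved, and AndXOffset fields. The check that the offset points forward is a defensive measure of our own, intended to keep a malformed message from sending the parser around in circles.

```c
#include <stddef.h>
#include <stdint.h>

#define SMB_COM_NONE 0xFF      /* marks the end of the AndX list */

typedef struct {
    uint8_t  command;          /* command code of the next block      */
    uint16_t offset;           /* its offset from the start of header */
} andx_link;

/* msg points at the start of the SMB header, len is the SMB length,
 * and block is the offset of the current AndX block (its WordCount).
 * Returns 1 with *next filled in, 0 at the end of the list, or -1 if
 * the message is malformed.
 */
int andx_next(const uint8_t *msg, size_t len, size_t block,
              andx_link *next)
{
    /* WordCount + AndXCommand + reserved + AndXOffset = 5 bytes,
     * and the AndX fields occupy the first two parameter words. */
    if (block + 5 > len || msg[block] < 2)
        return -1;

    next->command = msg[block + 1];
    next->offset  = (uint16_t)(msg[block + 3] | (msg[block + 4] << 8));

    if (next->command == SMB_COM_NONE)
        return 0;
    if (next->offset <= block || next->offset >= len)
        return -1;             /* must point forward, inside the SMB */
    return 1;
}
```

The first block always starts at offset 32, immediately after the header, and its command code is the one in the header's COMMAND field.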

2.4.5 The Flow of Conversation

SMB conversations start after the session has been established via the transport layer. As a rule, the client always speaks first. Clients send requests, servers respond, and that's the way SMB is supposed to work. This is a hard-and-fast rule which means, of course, that there is an exception. Fortunately, we can (and will) put off talking about that exception until we talk about Opportunistic Locks (OpLocks).

The NEGOTIATE PROTOCOL REQUEST/RESPONSE is always the first SMB exchange in the conversation. The client and server need to know what language to speak before they can say anything else. This is also a hard-and-fast rule, but there are no exceptions (which is an exception to the rule that all hard-and-fast rules have exceptions).

Once the dialect has been selected, the next formality is to establish an SMB session using the SMB SESSION SETUP REQUEST message. We keep running into terminology twists, and here we have yet another. The SMB SESSION SETUP exchange sets up an SMB session within the NBT or naked TCP session.

Huh?

Well, yes, that's confusing. The problem is that we are talking about two different kinds of sessions here.

  • There is the network session built at layer 5 of the OSI model, on top of the transport layer.

  • There is the user logon session.

Ah, there's a clue! The SESSION SETUP is used to perform authentication and establish a user session with the server [17]. A quick look at the SESSION SETUP ANDX REQUEST block in the packet above shows that the Exists utility did in fact send a username--the name "GUEST", passed via the AccountName field--to the server.

Once the user session is established, the client may try to connect to a share using a TREE CONNECT SMB. It is a hard-and-fast rule that TREE CONNECT SMBs must follow the SESSION SETUP. There is an exception to this as well, which we will cover when we get to share-mode vs. user-mode authentication.

[Figure 2.6]

Figure 2.6 shows the right way to start an SMB conversation. Combining the SESSION SETUP ANDX and TREE CONNECT ANDX SMBs into a single AndX message is optional (jCIFS' Exists does, but Samba's smbclient doesn't). Once the conversation has been initiated using the above sequence, the client is free to improvise.
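
The opening sequence can be expressed as a tiny state machine. The sketch below is a deliberate simplification: the type and function names are invented, only the three command codes seen in this section are recognized, and the exceptions mentioned above (such as share-mode authentication) are ignored.

```c
#include <stdint.h>

#define SMB_COM_NEGOTIATE           0x72
#define SMB_COM_SESSION_SETUP_ANDX  0x73
#define SMB_COM_TREE_CONNECT_ANDX   0x75

typedef enum {
    CONV_START,        /* transport session is up, no SMBs sent yet */
    CONV_NEGOTIATED,   /* dialect selected                          */
    CONV_SESSION,      /* user session established                  */
    CONV_TREE          /* connected to a share; free to improvise   */
} conv_state;

/* Returns the state that follows cmd, or -1 if cmd is out of order. */
int conv_advance(conv_state state, uint8_t cmd)
{
    switch (state) {
    case CONV_START:
        return (cmd == SMB_COM_NEGOTIATE) ? CONV_NEGOTIATED : -1;
    case CONV_NEGOTIATED:
        return (cmd == SMB_COM_SESSION_SETUP_ANDX) ? CONV_SESSION : -1;
    case CONV_SESSION:
        return (cmd == SMB_COM_TREE_CONNECT_ANDX) ? CONV_TREE : -1;
    case CONV_TREE:
        return CONV_TREE;
    }
    return -1;
}
```

When the SESSION SETUP ANDX and TREE CONNECT ANDX are combined into one AndX message, the function would be called once for each block in the chain.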

2.4.6 A Little More Code

There is another small detail you may have noticed while studying the captured SMB packets--or perhaps you remember this from one of the !Alert boxes in the NBT section: SMBs are written using little-endian byte order. If your target platform is big-endian, or if you want your code to be portable to big-endian systems, you will need to be able to handle the conversion between host and SMB byte order.

The htonl(), htons(), ntohl(), and ntohs() functions won't help us here. They convert between host and network order. We need to be able to convert between host and SMB order (and SMB order is definitely not the same as network order).

So, to solve the problem, we need a little bit of code, which is presented here mostly to get it out of the way so that we won't have to bother with it when we are dealing with more complex issues. The functions in Listing 2.2 read short and long integer values directly from incoming message buffers and write them directly to outgoing message buffers.

[Listing 2.2]
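
For readers who want to see the idea in miniature, here is a minimal sketch of such helpers. It assumes C99 fixed-width types, and the function names are invented here; they need not match those used in Listing 2.2. Because the functions work one byte at a time they behave the same on big-endian and little-endian hosts, and they never perform an unaligned access.

```c
#include <stdint.h>

/* Read a 16-bit value stored in SMB (little-endian) order. */
uint16_t smb_get16(const uint8_t *src)
{
    return (uint16_t)(src[0] | (src[1] << 8));
}

/* Read a 32-bit value stored in SMB (little-endian) order. */
uint32_t smb_get32(const uint8_t *src)
{
    return (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
           ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
}

/* Write a 16-bit value in SMB order. */
void smb_set16(uint8_t *dst, uint16_t val)
{
    dst[0] = (uint8_t)(val & 0xFF);
    dst[1] = (uint8_t)(val >> 8);
}

/* Write a 32-bit value in SMB order. */
void smb_set32(uint8_t *dst, uint32_t val)
{
    dst[0] = (uint8_t)(val & 0xFF);
    dst[1] = (uint8_t)((val >> 8) & 0xFF);
    dst[2] = (uint8_t)((val >> 16) & 0xFF);
    dst[3] = (uint8_t)(val >> 24);
}
```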

2.4.7 Take a Break

Our field trip into SMB territory is now over. We have covered a lot of ground, collected samples, and taken a look at SMBs in the wild. Our next step will be doing the lab work, studying our specimens under a microscope. It is time to take a break, relax, and reflect on what we have learned so far.

Time for a cup of tea.

In the next section we will go back over the SMB header in a lot more detail with the goal of explaining some of the key concepts that we have only touched on so far. You will probably want to be well rested and in a good mood for that.


2.5 The SMB Header in Detail

During that first expedition into SMB territory we continually deferred studying the finer details of the SMB header, among other things. We were trying to cover the general concepts, but now we need to dig into the guts of SMB to see how things really work. Latex gloves and lab coats required.

Let's start by revisiting the header layout. Just for review, here's what it looks like:

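     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---------------+---------------+---------------+---------------+
    |     0xff      |      'S'      |      'M'      |      'B'      |
    +---------------+---------------+---------------+---------------+
    |    COMMAND    |                   STATUS...                   |
    +---------------+---------------+-------------------------------+
    |   ...STATUS   |     FLAGS     |            FLAGS2             |
    +---------------+---------------+-------------------------------+
    |                             EXTRA                             |
    |                              ...                              |
    |                              ...                              |
    +-------------------------------+-------------------------------+
    |              TID              |              PID              |
    +-------------------------------+-------------------------------+
    |              UID              |              MID              |
    +-------------------------------+-------------------------------+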

The first four bytes are constant, so we won't worry about those. The COMMAND field is fairly straight-forward too; it's just a one byte field containing an SMB command code. The list of available codes is given in section 5.1 of the SNIA doc. The rest of the header is where the fun lies....

2.5.1 The SMB_HEADER.STATUS Field Exposed

Things get interesting starting at the STATUS field. It wouldn't be so bad except for the fact that there are two possible error code formats to consider. There is the DOS & OS/2 format, and then there is the NT_STATUS format. In C-language terms, the STATUS field looks something like this:

    typedef union
      {
      ulong NT_Status;
      struct
        {
        uchar  ErrorClass;
        uchar  reserved;
        ushort ErrorCode;
        } DosError;
      } Status;
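
Overlaying a union on the wire bytes only gives the right answer on a little-endian host with no structure padding, so portable code may prefer to pick the field apart by hand. The sketch below shows one way to do that. The type and function names are invented for this example, and the byte positions assume the DOS layout described earlier (ErrorClass first, then the reserved byte, then a little-endian ErrorCode).

```c
#include <stdint.h>

typedef struct {
    uint8_t  error_class;
    uint16_t error_code;
} dos_error;

/* s points at the four STATUS bytes as they appear on the wire. */
uint32_t status_as_nt(const uint8_t *s)
{
    return (uint32_t)s[0] | ((uint32_t)s[1] << 8) |
           ((uint32_t)s[2] << 16) | ((uint32_t)s[3] << 24);
}

/* Interpret the same four bytes as a DOS class/code pair. */
dos_error status_as_dos(const uint8_t *s)
{
    dos_error e;

    e.error_class = s[0];
    /* s[1] is the reserved byte */
    e.error_code  = (uint16_t)(s[2] | (s[3] << 8));
    return e;
}
```

Given the wire bytes 03 00 1F 00, status_as_dos() reports ErrorClass 0x03 and ErrorCode 0x001F. Given 1D 00 00 C0, status_as_nt() returns 0xC000001D.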

From the client side, one way to deal with the split personality problem is to use the DOS codes exclusively [18]. These are fairly well documented (by SMB standards), and should be supported by all SMB servers. Using DOS codes is probably a good choice, but there is a catch... There are some advanced features which simply don't work unless the client negotiates NT_STATUS codes.
 


 
Rats!
-- Charlie Brown
Peanuts, by Charles Schulz
  

Strange Behavior Alert:

If the client negotiates Extended Security with a Windows2000 server and also negotiates DOS error codes, then the SESSION SETUP ANDX will fail, and return a DOS hardware error. (!?)
    STATUS
      {
      ErrorClass = 0x03   (Hardware Error)
      ErrorCode  = 0x001F (General Error)
      }

Perhaps W2K doesn't know which DOS error to return, and is guessing. The bigger question is: why does this fail at all?

The same SMB conversation with the NT_STATUS capability enabled works just fine. Perhaps, when the coders were coding that piece of code, they assumed that only clients capable of using NT_STATUS codes would also use the Extended Security feature. Perhaps that assumption came from the knowledge that all Windows systems that could handle Extended Security would negotiate NT_STATUS. We can only guess...

This is one of the oddities of SMB, and another fine bit of forensic SMB research by Andrew Bartlett of the Samba Team.
 

Another reason to support NT_STATUS codes is that they provide finer-grained diagnostics, simply because there are more of them defined than there are DOS codes. Samba has a fairly complete list of the known NT_STATUS codes, which can be found in the samba/source/include/nterr.h file in the Samba distribution. The list of DOS codes is in doserr.h in the same directory.

We have already described the structure of the DOS error codes. NT_STATUS codes also have a structure, and it looks like this:

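     0                   1                   2                   3
     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
    +---+---+-----------------------+-------------------------------+
    | L | r |       Facility        |           ErrorCode           |
    +---+---+-----------------------+-------------------------------+
      L = Level, r = <reserved>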

In testing, it appears as though the Facility field is always set to zero (FACILITY_NULL) for SMB errors. That leaves us with the Level and ErrorCode fields to provide variety ... and, as we have suggested, there is quite a bit of variety. Samba's nterr.h file lists over 500 NT_STATUS codes, while doserr.h lists only 99 (and some of those are repeats).

Level is one of the following:

00 == Success
01 == Information
10 == Warning
11 == Error

Since the next two bits (the <reserved> bits) are always zero, the highest-order nibble will have one of the following values: 0x0, 0x4, 0x8, or 0xC. At the other end of the longword, the ErrorCode is read as an unsigned short (just like the DOS ErrorCode field).
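
The subfields can be pulled out of an NT_STATUS value with a few shifts and masks. This is a sketch with invented function names; the field widths are the ones given above (two bits of Level, two reserved bits, twelve bits of Facility, and a sixteen-bit ErrorCode).

```c
#include <stdint.h>

/* The two highest-order bits: 0 through 3. */
unsigned nt_status_level(uint32_t status)
{
    return (unsigned)((status >> 30) & 0x3);
}

/* The twelve Facility bits. */
unsigned nt_status_facility(uint32_t status)
{
    return (unsigned)((status >> 16) & 0x0FFF);
}

/* The low-order sixteen bits. */
unsigned nt_status_code(uint32_t status)
{
    return (unsigned)(status & 0xFFFF);
}

/* Map the Level to the names listed above. */
const char *nt_status_level_name(uint32_t status)
{
    static const char *names[4] = {
        "Success", "Information", "Warning", "Error"
    };
    return names[nt_status_level(status)];
}
```

For 0xC000001D, these return a Level of 3 ("Error"), a Facility of zero, and an ErrorCode of 0x001D.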

The availability of Samba's list of NT_STATUS codes makes things easy. It took a bit of doing to generate that list, however, as most of the codes are not documented in an accessible form. Andrew Tridgell described the method below, which he used to generate a list of valid NT_STATUS codes. His results were used to create the nterr.h file used in Samba.


Tridge's Trick:

  1. Modify the source of Samba's smbd daemon so that whenever you try to delete a file that matches a specific pattern it will return an NT_STATUS error code. (Do this on a testing copy, of course. This hack is not meant for production.) For example, return an error whenever the filename to be deleted matches "STATUS_CODE_HACK_FILENAME.*". Another thing to do is to include the specific error number as the filename extension, so that the name

    STATUS_CODE_HACK_FILENAME.0xC000001D

    will cause Samba to return an NT_STATUS code of 0xC000001D.

  2. Create the files on the server side first so you have something to delete. That is easily done with a shell script, such as this:

        #!/bin/bash
        #
        # Create one empty file per NT_STATUS code 0xC0000000 + i,
        # for every i from its starting value up to j - 1.
        i=0;j=256
        while [ $i -lt $j ]
        do
          touch `printf "STATUS_CODE_HACK_FILENAME.0xC000%.4x" $i`
          i=`expr $i + 1`
        done

    Change the values of i and j to generate different ranges.

  3. On a WindowsNT or Windows2000 system, mount the Samba share containing the generated STATUS_CODE_HACK* files. Next, open a DOS command shell and, one by one, delete the files. For each file, Samba should return the specified NT_STATUS code...and Windows will interpret the code and tell you what it means. If the code is not defined, Windows will tell you that as well.

  4. If you capture the delete transactions using Microsoft's NetMon tool, it will show you the symbolic names that Microsoft uses for the NT_STATUS codes.
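The filename convention in the first two steps can be sketched in a few lines of Python. This is only an illustration of the naming scheme: the real hack lives in smbd's C source, and the names HACK_PATTERN, hack_filenames, and status_from_filename are made up for this example.

```python
import re
from typing import List, Optional

# Matches names such as STATUS_CODE_HACK_FILENAME.0xC000001D
HACK_PATTERN = re.compile(r"^STATUS_CODE_HACK_FILENAME\.0x([0-9A-Fa-f]{8})$")


def hack_filenames(start: int, stop: int) -> List[str]:
    """Build the same names the shell script creates, for i in [start, stop)."""
    return [f"STATUS_CODE_HACK_FILENAME.0xC000{i:04x}" for i in range(start, stop)]


def status_from_filename(filename: str) -> Optional[int]:
    """Return the NT_STATUS code encoded in the extension, or None if no match."""
    match = HACK_PATTERN.match(filename)
    if match is None:
        return None
    return int(match.group(1), 16)


for name in hack_filenames(0x1C, 0x1E):
    print(name, hex(status_from_filename(name)))
```

A modified server would presumably do the equivalent of status_from_filename() in its delete handler and, when the result is not None, return that value in the STATUS field instead of deleting anything.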
     

Okay, now for the next conundrum...

Servers have it tougher than clients. Consider a server that needs to respond to one client using DOS error codes, and to another client using NT_STATUS codes. That's bad enough, but consider what happens when that server needs to query yet another server in order to complete some operation. For example, a file server might need to contact a Domain Controller in order to authenticate the user.

The problem is that, no matter which STATUS format the Domain Controller uses when responding to the file server, it will be the wrong format for one of the clients. To solve this problem the server needs to provide a consistent mapping between DOS and NT_STATUS codes.
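As a rough sketch of what such a mapping layer might look like, consider the Python fragment below. Everything in it is an assumption made for illustration: the function names are invented, the two table entries are believed to agree with Samba's errormap.c but should be checked against that file, and the fallback to a generic hardware-class error is a design choice of this sketch.

```python
from typing import Tuple, Union

# DOS error classes and codes (numeric values as commonly defined; assumed here).
ERRDOS, ERRHRD = 0x01, 0x03
ERRbadfile, ERRnoaccess, ERRgeneral = 2, 5, 31

# Illustrative entries only. A real table has hundreds of rows.
NT_TO_DOS = {
    0xC000000F: (ERRDOS, ERRbadfile),   # NT_STATUS_NO_SUCH_FILE
    0xC0000022: (ERRDOS, ERRnoaccess),  # NT_STATUS_ACCESS_DENIED
}


def to_dos(nt_status: int) -> Tuple[int, int]:
    """Map an NT_STATUS code to a (class, code) pair, with a generic fallback."""
    if nt_status == 0:
        return (0, 0)  # success is success in either format
    return NT_TO_DOS.get(nt_status, (ERRHRD, ERRgeneral))


def status_for_client(nt_status: int, client_uses_nt_status: bool) -> Union[int, Tuple[int, int]]:
    """Pick the STATUS representation that this particular client negotiated."""
    return nt_status if client_uses_nt_status else to_dos(nt_status)


print(status_for_client(0xC0000022, True))   # NT_STATUS client sees the code as-is
print(status_for_client(0xC0000022, False))  # DOS client sees a (class, code) pair
```

The point of the sketch is the shape of the solution: the server keeps one internal format (NT_STATUS here) and translates at the last moment, per client, so the Domain Controller's choice of format no longer matters.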

WindowsNT and Windows2000 both have such mappings built-in but, of course, the details are not published (a partial list is given in section 6 of the SNIA doc). Andrew Bartlett used a trick similar to Tridge's in order to generate the required mappings. His setup uses a Samba server running as a Primary Domain Controller (PDC), and a Windows2000 system providing SMB file services. A third system, running Samba's smbtorture testing utility, acts as the client. When the client system tries to log on to the Windows server, Windows passes the login request to the Samba PDC.

The test works like this:


Andrew Bartlett's Trick:

  1. Modify Samba's authentication code to reject login attempts for any username beginning with "0x". Translate the login name (e.g. "0xC000001D") into an NT_STATUS code, and return that in the STATUS field.

  2. Configure smbtorture to negotiate DOS error codes. Aim smbtorture at the W2K SMB server and try logging in as user 0xC0000001, 0xC0000002... etc.

  3. For each login attempt from the client, the Windows SMB server will receive a login failure message from the Samba PDC. Since smbtorture has requested DOS error codes, the W2K pickle-in-the-middle is forced to translate the NT_STATUS values into DOS error codes...and that's how you can discover Microsoft's mapping of NT_STATUS codes to DOS error codes.

The test configuration is shown in figure 2.7.
 

[Figure 2.7]

Andrew's test must be rerun periodically. The mappings have been known to change when Windows service packs are installed. See the file samba/source/libsmb/errormap.c in the Samba distribution for more fun and adventure[19].

2.5.2 The FLAGS and FLAGS2 Fields Tell All

Most (but not all) of the bits in the older FLAGS field are of interest only to older servers. They represent features that have been superseded by newer features in newer servers. It would be nice if all of the old stuff would just go away so that we wouldn't have to worry about it. It does seem, in fact, as though this is slowly happening. (Maybe it would be better if the old stuff stayed and the new stuff had never happened. Hmmm...)
 

Duh... dat sounds logical!
-- Baby Huey
Harvey Entertainment
  

In any case, this next table presents the FLAGS bits in order of descending significance--the opposite of the order used in the SNIA doc. English speaking people tend to read from left to right and from top to bottom, so it seems logical (as this book is, more or less, written in English[20]) to transpose left-to-right order into a top-to-bottom table.

SMB_HEADER.FLAGS
Bit # Name / Bitmask / Values Description
7  SMB_FLAGS_SERVER_TO_REDIR
0x80

0: request
1: reply

What an awful name!
On DOS, OS/2, and Windows systems, the client is built into the operating system and is called a "redirector", which is where the "SERVER_TO_REDIR" part of the name comes from. Basically, though, this is simply the reply flag.

6  SMB_FLAGS_REQUEST_BATCH_OPLOCK
0x40

0: Exclusive
1: Batch

Obsolete.
If bit 5 is set, then bit 6 is the "batch OpLock" (aka. OPBATCH) bit. Bit 6 should be clear if bit 5 is clear.

In a request from the client, this bit is used to indicate whether the client wants an exclusive OpLock (0) or a batch OpLock (1). In a response, this bit indicates that the server has granted the batch OpLock.

OpLocks (opportunistic locks) will be covered later.

This bit is only used in the deprecated SMB_COM_OPEN, SMB_COM_CREATE, and SMB_COM_CREATE_NEW SMBs. It should be zero in all other SMBs. The SMB_COM_OPEN_ANDX SMB has a separate set of flags that handle OpLock requests, as does the SMB_COM_NT_CREATE_ANDX SMB.

5  SMB_FLAGS_REQUEST_OPLOCK
0x20

0: no OpLock
1: OpLock

Obsolete.
This is the "OpLock" bit. If this bit is set in a request, it indicates that the client wants to obtain an OpLock. If set in the reply, it indicates that the server has granted the OpLock.

OpLocks (opportunistic locks) will be covered later.

This bit is only used in the deprecated SMB_COM_OPEN, SMB_COM_CREATE, and SMB_COM_CREATE_NEW SMBs. It should be zero in all other SMBs. The SMB_COM_OPEN_ANDX SMB has a separate set of flags that handle OpLock requests, as does the SMB_COM_NT_CREATE_ANDX SMB. (Sigh.)

4  SMB_FLAGS_CANONICAL_PATHNAMES
0x10

0: Host format
1: Canonical

Obsolete.
This was supposed to be used to indicate whether or not pathnames in SMB messages were mapped to their "canonical" form. Thing is, it doesn't do much good to write a client or server that doesn't map names to the canonical form (which is basically DOS, OS/2, or Windows compatible). This bit should always be set (1).
3  SMB_FLAGS_CASELESS_PATHNAMES
0x08

0: case-sensitive
1: caseless

When this bit is clear (0), pathnames should be treated as case-sensitive. When the bit is set, pathnames are considered caseless.

All good in theory. The trouble is that some systems assume caseless pathnames no matter what the state of this bit. Best practice on the client side is to leave this bit set (1) and always assume caseless pathnames.

2  0x04 <Reserved> (must be zero)
...well, sort of. This bit is clearly listed as "Reserved (must be zero)" in both the SNIA and the X/Open docs, yet the latter contains some odd references to optionally using this bit in conjunction with OpLocks. It's probably a typo. Best bet is to clear it (0) and leave it alone.
1  SMB_FLAGS_CLIENT_BUF_AVAIL
0x02

0: Not posted
1: Buffer posted

Obsolete.
This was probably useful with other transports, such as NetBEUI. If the client sets this bit, it is telling the server that it has already posted a buffer to receive the server's response. The expired Leach/Naik Internet Draft says that this allows a "send without acknowledgment" from the server.

This bit should be clear (0) for use with NBT and naked TCP transports.

0  SMB_FLAGS_SUPPORT_LOCKREAD
0x01

0: Not supported
1: Supported

Obsolete.
If this bit is set in the SMB NEGOTIATE PROTOCOL RESPONSE, then the server supports the deprecated SMB_COM_LOCK_AND_READ and SMB_COM_WRITE_AND_UNLOCK SMBs. Unless you are implementing outdated dialects, this bit should be clear (0).

The NEGOTIATE PROTOCOL REQUEST that we dissected back in section 2.4.3 shows only the SMB_FLAGS_CANONICAL_PATHNAMES and SMB_FLAGS_CASELESS_PATHNAMES bits set, which is probably the best thing for new implementations to do. Testing with other clients may reveal other workable combinations.
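The table translates naturally into a set of bitmask constants. The Python sketch below uses the constant names from the table; the helper functions describe_flags and flags_problems, and the name NEW_CLIENT_FLAGS, are hypothetical and exist only to show how the bits might be tested.

```python
SMB_FLAGS_SERVER_TO_REDIR      = 0x80
SMB_FLAGS_REQUEST_BATCH_OPLOCK = 0x40
SMB_FLAGS_REQUEST_OPLOCK       = 0x20
SMB_FLAGS_CANONICAL_PATHNAMES  = 0x10
SMB_FLAGS_CASELESS_PATHNAMES   = 0x08
SMB_FLAGS_RESERVED             = 0x04  # must be zero
SMB_FLAGS_CLIENT_BUF_AVAIL     = 0x02
SMB_FLAGS_SUPPORT_LOCKREAD     = 0x01

# Listed in order of descending significance, like the table.
FLAG_NAMES = [
    (SMB_FLAGS_SERVER_TO_REDIR, "SMB_FLAGS_SERVER_TO_REDIR"),
    (SMB_FLAGS_REQUEST_BATCH_OPLOCK, "SMB_FLAGS_REQUEST_BATCH_OPLOCK"),
    (SMB_FLAGS_REQUEST_OPLOCK, "SMB_FLAGS_REQUEST_OPLOCK"),
    (SMB_FLAGS_CANONICAL_PATHNAMES, "SMB_FLAGS_CANONICAL_PATHNAMES"),
    (SMB_FLAGS_CASELESS_PATHNAMES, "SMB_FLAGS_CASELESS_PATHNAMES"),
    (SMB_FLAGS_CLIENT_BUF_AVAIL, "SMB_FLAGS_CLIENT_BUF_AVAIL"),
    (SMB_FLAGS_SUPPORT_LOCKREAD, "SMB_FLAGS_SUPPORT_LOCKREAD"),
]

# The combination seen in the NEGOTIATE PROTOCOL REQUEST from section 2.4.3.
NEW_CLIENT_FLAGS = SMB_FLAGS_CANONICAL_PATHNAMES | SMB_FLAGS_CASELESS_PATHNAMES


def describe_flags(flags: int) -> list:
    """Return the names of all defined bits that are set in a FLAGS byte."""
    return [name for mask, name in FLAG_NAMES if flags & mask]


def flags_problems(flags: int) -> list:
    """Report combinations that the table says should not occur."""
    problems = []
    if flags & SMB_FLAGS_RESERVED:
        problems.append("reserved bit 0x04 is set")
    if flags & SMB_FLAGS_REQUEST_BATCH_OPLOCK and not flags & SMB_FLAGS_REQUEST_OPLOCK:
        problems.append("batch OpLock bit set without the OpLock bit")
    return problems


print(hex(NEW_CLIENT_FLAGS), describe_flags(NEW_CLIENT_FLAGS))
```

A reply to that same request would have SMB_FLAGS_SERVER_TO_REDIR set as well, so a quick `flags & SMB_FLAGS_SERVER_TO_REDIR` test is enough to tell requests from replies.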

Now let's take a look at the newer flags in the FLAGS2 field.

SMB_HEADER.FLAGS2
Bit # Name / Bitmask / Values Description
15  SMB_FLAGS2_UNICODE_STRINGS
0x8000

0: ASCII
1: Unicode

If set (1), this bit indicates that string fields within the SMB message are encoded using a two-byte, little-endian Unicode format. The SNIA doc says that the format is UTF-16LE but some folks on the Samba Team say it's really UCS-2LE. The latter is probably correct, but it may not matter as both formats are probably the same for the Basic Multilingual Plane. Doesn't Unicode sound like fun[21]?

If clear (0), all strings are in 8-bit ASCII format (by which we actually mean 8-bit OEM character set format).

14  SMB_FLAGS2_32BIT_STATUS
0x4000

0: DOS error code
1: NT_STATUS code

Indicates whether the STATUS field is in DOS or NT_STATUS format. This may also be used to help the server guess which format the client prefers before it has actually been negotiated.
13  SMB_FLAGS2_READ_IF_EXECUTE
0x2000

0: Execute != Read
1: Execute confers Read

A quirky little bit this. If set (1), it indicates that execute permission on a file also grants read permission. It is only useful in read operations.
12  SMB_FLAGS2_DFS_PATHNAME
0x1000

0: Normal pathname
1: DFS pathname

This is used with the Distributed File System (DFS), which we haven't covered yet. If this bit is set (1), it indicates that the client knows about DFS, and that the server should resolve any UNC names in the SMB message by looking in the DFS namespace. If this bit is clear (0), the server should not check the DFS namespace.
11  SMB_FLAGS2_EXTENDED_SECURITY
0x0800

0: Normal security
1: Extended security

If set (1), this bit indicates that the sending node understands Extended Security. We'll touch on this again when we discuss authentication.
10  0x0400 <Reserved> (must be zero)
9  0x0200 <Reserved> (must be zero)
8  0x0100 <Reserved> (must be zero)
7  0x0080 <Reserved> (must be zero)
6  SMB_FLAGS2_IS_LONG_NAME
0x0040

0: 8.3 format
1: Long names

If set (1), then any pathnames that the SMB contains are long pathnames, else the pathnames are in 8.3 format. Any new CIFS implementation really should support long names.
5  0x0020 <Reserved> (must be zero)
4  0x0010 <Reserved> (must be zero)
3  0x0008 <Reserved> (must be zero)
2  SMB_FLAGS2_SECURITY_SIGNATURE
0x0004

0: No signature
1: Message Authentication Codes

If set, the SMB contains a Message Authentication Code (MAC). The MAC is used to authenticate each packet in a session, to prevent various attacks.
1  SMB_FLAGS2_EAS
0x0002

0: No EAs
1: Extended Attributes

Indicates that the client understands Extended Attributes.

Note that the SNIA doc talks about "Extended Attributes" and about "Extended File Attributes". These are two completely different concepts. Extended Attributes are a feature of OS/2. They are mentioned in section 1.1.6 (page 2) of the SNIA doc and explained in better detail on page 87. Extended File Attributes are described in section 3.13 (page 30) of the SNIA doc.

The SMB_FLAGS2_EAS bit deals with Extended Attribute support.

0  SMB_FLAGS2_KNOWS_LONG_NAMES
0x0001

0: Client wants 8.3
1: Long pathnames okay

Set by the client to let the server know that long names are acceptable in the response.

Some of the flags are used to modify the interpretation of the SMB message, while others are used to negotiate features. Some do both. It may take some experimentation to find the safest way to handle these bits. Implementations are not consistent, so new code must be fine-tuned.

You may need to refer back to these tables as we dig further into the details. Note that the constant names listed above may not match those in the SNIA doc, or those in other docs or available source code. There doesn't seem to be a lot of agreement on the names.
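For reference while experimenting, here is the FLAGS2 table restated as a small Python sketch. The bitmasks and constant names are taken from the table above; FLAGS2_NAMES, FLAGS2_RESERVED_MASK, describe_flags2, reserved_bits_set, and status_format are names invented for this example.

```python
# Defined FLAGS2 bits, in order of descending significance.
FLAGS2_NAMES = {
    0x8000: "SMB_FLAGS2_UNICODE_STRINGS",
    0x4000: "SMB_FLAGS2_32BIT_STATUS",
    0x2000: "SMB_FLAGS2_READ_IF_EXECUTE",
    0x1000: "SMB_FLAGS2_DFS_PATHNAME",
    0x0800: "SMB_FLAGS2_EXTENDED_SECURITY",
    0x0040: "SMB_FLAGS2_IS_LONG_NAME",
    0x0004: "SMB_FLAGS2_SECURITY_SIGNATURE",
    0x0002: "SMB_FLAGS2_EAS",
    0x0001: "SMB_FLAGS2_KNOWS_LONG_NAMES",
}

# Every bit the table marks as <Reserved> (must be zero).
FLAGS2_RESERVED_MASK = 0x0400 | 0x0200 | 0x0100 | 0x0080 | 0x0020 | 0x0010 | 0x0008


def describe_flags2(flags2: int) -> list:
    """Return the names of all defined bits that are set in a FLAGS2 word."""
    return [name for mask, name in FLAGS2_NAMES.items() if flags2 & mask]


def reserved_bits_set(flags2: int) -> int:
    """Return the reserved bits that are set (zero means the field is clean)."""
    return flags2 & FLAGS2_RESERVED_MASK


def status_format(flags2: int) -> str:
    """Say which format the STATUS field is in, per SMB_FLAGS2_32BIT_STATUS."""
    return "NT_STATUS" if flags2 & 0x4000 else "DOS"


sample = 0xC001  # Unicode strings, NT_STATUS codes, long names okay
print(describe_flags2(sample), status_format(sample), hex(reserved_bits_set(sample)))
```

Given how inconsistent implementations are, a checker like reserved_bits_set() is probably most useful for logging oddities seen on the wire, not for rejecting messages outright.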

2.5.3 EXTRA! EXTRA! Read All About It!

Um, actually we are going to delay covering the EXTRA field yet again. EXTRA.PidHigh will be thrown in with the PID field, and EXTRA.Signature will be handled as part of authentication.

2.5.4 TID and UID: Separated at Birth?

It would seem logical that the [V]UID and TID fields would be somehow related. Both are assigned and managed by the server, and we said before that the SESSION SETUP (where the logon occurs) is supposed to happen before the TREE CONNECT.

Well, put all that aside and pay attention to this little story...


Storytime

Once upon a time there were many, many magic kingdoms taking up office space in cities and towns around the world. In each of these magic kingdoms there were lots of overpaid advisors called VeePees. The VeePees were all jealous of one another, but they were more jealous of the underpaid wizards in the IT department who had power over the data and could work spells and make the numbers come out all right.

Then, one day, evil marketing magicians appeared and convinced the VeePees that they could steal all of the power away from the wizards