TCP(7P)




NAME

     tcp, TCP - Internet Transmission Control Protocol


SYNOPSIS

     #include <sys/socket.h>

     #include <netinet/in.h>

     s = socket(AF_INET, SOCK_STREAM, 0);

     s = socket(AF_INET6, SOCK_STREAM, 0);

     t = t_open("/dev/tcp", O_RDWR);

     t = t_open("/dev/tcp6", O_RDWR);


DESCRIPTION

     TCP is the virtual circuit protocol of the Internet protocol
     family.   It  provides  reliable, flow-controlled, in order,
     two-way transmission of data.  It is a byte-stream  protocol
     layered  above  the  Internet Protocol (IP), or the Internet
     Protocol Version 6 (IPv6), the  Internet  protocol  family's
     internetwork datagram delivery protocol.

     Programs can access  TCP using the  socket  interface  as  a
     SOCK_STREAM socket type, or using the Transport Level Inter-
     face  (TLI)  where  it  supports   the   connection-oriented
     (T_COTS_ORD) service type.

     TCP uses IP's host-level addressing and adds  its  own  per-
     host  collection of "port addresses." The endpoints of a TCP
     connection are identified by the combination  of  an  IP  or
     IPv6  address  and  a TCP port number. Although other proto-
     cols, such as the User Datagram Protocol (UDP), may use  the
     same  host  and port address format, the port space of these
     protocols  is  distinct.  See  inet(7P)  and  inet6(7P)  for
     details  on the common aspects of addressing in the Internet
     protocol family.

     Sockets utilizing TCP  are  either  "active"  or  "passive."
     Active  sockets  initiate  connections  to  passive sockets.
     Both types of sockets must  have  their  local  IP  or  IPv6
     address  and  TCP  port  number bound with the bind(3SOCKET)
     system call after the socket is  created.  By  default,  TCP
     sockets  are  active. A passive socket is created by calling
     the listen(3SOCKET) system call  after  binding  the  socket
     with  bind().  This establishes a queueing parameter for the
     passive socket.  After  this,  connections  to  the  passive
     socket can be received with the accept(3SOCKET) system call.
     Active sockets use the connect(3SOCKET) call  after  binding
     to initiate connections.
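
     The sequence above can be sketched in C as follows. This  is
     a  minimal  illustration,  not  a  reference implementation:
     the function names passive_open() and active_open() and  the
     helper  fill_addr() are hypothetical, the backlog of 5 is an
     arbitrary choice, and both ends use the loopback address  so
     that  the  sketch  is  self-contained. On SunOS such a program
     would typically be linked with -lsocket -lnsl.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Fill in an IPv4 socket address; host and port are in host byte order. */
static void fill_addr(struct sockaddr_in *sin, in_addr_t host,
                      unsigned short port)
{
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_addr.s_addr = htonl(host);
    sin->sin_port = htons(port);
}

/*
 * Passive side: bind() to the loopback address, then listen().
 * Port 0 asks the system to choose a free port, which is reported
 * through *port. Returns the socket, or -1 on error.
 */
int passive_open(unsigned short *port)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int s;

    assert(port != NULL);
    if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;
    fill_addr(&sin, INADDR_LOOPBACK, 0);
    if (bind(s, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
        listen(s, 5) < 0 ||
        getsockname(s, (struct sockaddr *)&sin, &len) < 0) {
        close(s);
        return -1;
    }
    *port = ntohs(sin.sin_port);
    return s;
}

/*
 * Active side: bind() a local address, then connect() to the
 * passive socket on the loopback address.
 * Returns the connected socket, or -1 on error.
 */
int active_open(unsigned short port)
{
    struct sockaddr_in local, peer;
    int s;

    if ((s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;
    fill_addr(&local, INADDR_ANY, 0);
    fill_addr(&peer, INADDR_LOOPBACK, port);
    if (bind(s, (struct sockaddr *)&local, sizeof(local)) < 0 ||
        connect(s, (struct sockaddr *)&peer, sizeof(peer)) < 0) {
        close(s);
        return -1;
    }
    return s;
}
```

     A caller would invoke passive_open() first, pass the reported
     port  to active_open(), and then call accept() on the passive
     socket to obtain the server end of the connection.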

     By using the  special  value  INADDR_ANY  with  IP,  or  the
     unspecified  address  (all  zeroes)  with IPv6, the local IP
     address can be left unspecified in the bind() call by either
     active  or passive TCP sockets. This feature is usually used
     if the local address is either  unknown  or  irrelevant.  If
     left unspecified, the local IP or IPv6 address will be bound
     at connection time to the address of the  network  interface
     used to service the connection.
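
     The effect of leaving the local address unspecified  can  be
     observed  with  getsockname(). The sketch below is illustra-
     tive only: wildcard_demo() is a hypothetical  name,  and  the
     connection  is made over the loopback interface so that the
     example needs no second host. It binds a listening socket to
     INADDR_ANY  and  reports  the  local address before and after
     a connection is accepted.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Bind a listener to INADDR_ANY, connect to it over the loopback
 * interface, and report the local address of the listener and of
 * the accepted connection (both in network byte order).
 * Returns 0 on success, -1 on error.
 */
int wildcard_demo(struct in_addr *listening, struct in_addr *connected)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);
    int ls, c, a = -1, rc = -1;

    assert(listening != NULL && connected != NULL);
    ls = socket(AF_INET, SOCK_STREAM, 0);
    c = socket(AF_INET, SOCK_STREAM, 0);
    if (ls < 0 || c < 0)
        goto out;

    /* Wildcard address, system-chosen port. */
    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_addr.s_addr = htonl(INADDR_ANY);
    if (bind(ls, (struct sockaddr *)&sin, sizeof(sin)) < 0 ||
        listen(ls, 1) < 0 ||
        getsockname(ls, (struct sockaddr *)&sin, &len) < 0)
        goto out;
    *listening = sin.sin_addr;

    /* sin now holds the chosen port; aim it at the loopback address. */
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (connect(c, (struct sockaddr *)&sin, sizeof(sin)) < 0)
        goto out;
    if ((a = accept(ls, NULL, NULL)) < 0)
        goto out;

    /* The accepted socket carries the address the peer connected to. */
    len = sizeof(sin);
    if (getsockname(a, (struct sockaddr *)&sin, &len) < 0)
        goto out;
    *connected = sin.sin_addr;
    rc = 0;
out:
    if (a >= 0)
        close(a);
    if (c >= 0)
        close(c);
    if (ls >= 0)
        close(ls);
    return rc;
}
```

     In this sketch the listener is expected to report  the  wild-
     card  address,  while  the  accepted connection reports the
     loopback address over which the connection arrived.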

     Once  a  connection  has  been  established,  data  can   be
     exchanged using the read(2) and write(2) system calls.

     Under  most  circumstances,  TCP  sends  data  when  it   is
     presented.  When  outstanding  data  has  not  yet been ack-
     nowledged, TCP gathers small amounts of output to be sent in
     a  single  packet once an acknowledgement has been received.
     For a small number of clients, such as window  systems  that
     send a stream of mouse events which receive no replies, this
     packetization may cause significant  delays.  To  circumvent
     this  problem,  TCP  provides  a  boolean   socket   option,
     TCP_NODELAY. TCP_NODELAY is defined in <netinet/tcp.h>,  and
     is   set   with    setsockopt(3SOCKET)   and   tested   with
     getsockopt(3SOCKET).  The option level for the  setsockopt()
     call   is  the  protocol  number  for  TCP,  available  from
     getprotobyname(3SOCKET).

     A socket-level option, SO_RCVBUF, can be used to control the
     window  that  TCP  advertises to the peer. IP-level options
     may also be used with TCP.  See ip(7P) and ip6(7P).

     TCP provides an urgent data mechanism, which may be  invoked
     using  the  out-of-band  provisions  of  send(3SOCKET).  The
     caller may mark one byte as "urgent" with the  MSG_OOB  flag
     to send(3SOCKET).  This sets an "urgent pointer" pointing to
     this byte in the TCP stream. The receiver on the other  side
     of  the  stream  is  notified of the urgent data by a SIGURG
     signal. The SIOCATMARK  ioctl(2)  request  returns  a  value
     indicating whether the stream is at the urgent mark. Because
     the system never returns data across the urgent  mark  in  a
     single read(2) call, it is possible to advance to the urgent
     data in a simple loop which reads data, testing  the  socket
     with  the  SIOCATMARK  ioctl() request, until it reaches the
     mark.

     Incoming connection requests that include an IP source route
     option  are  noted,  and the reverse source route is used in
     responding.

     A checksum over all data helps  TCP  implement  reliability.
     Using  a  window-based flow control mechanism that makes use
     of  positive  acknowledgements,  sequence  numbers,  and   a
     retransmission   strategy,  TCP  can  usually  recover  when
     datagrams are damaged, delayed, duplicated or delivered  out
     of order by the underlying communication medium.

     If the local TCP receives no acknowledgements from its  peer
     for  a  period  of  time, as would be the case if the remote
     machine crashed, the connection is closed and  an  error  is
     returned  to the user. If the remote machine reboots or oth-
     erwise loses state information about a TCP  connection,  the
     connection is aborted and an error is returned to the user.

     SunOS supports TCP  Extensions  for  High  Performance  (RFC
     1323),  which  include  the  window  scale  and  time  stamp
     options  and  Protection  Against   Wrap   Around   Sequence
     Numbers (PAWS). SunOS also supports Selective Acknowledgment
     (SACK) capabilities (RFC 2018) and the  Explicit  Congestion
     Notification (ECN) mechanism (RFC 3168).

     Turn on the window scale option  in  one  of  the  following
     ways:

        o  An  application  can  use  setsockopt()  to  set   the
           SO_SNDBUF  or  SO_RCVBUF  size  to a value larger than
           64K.  This  must  be  done  before  the  program calls
           listen()  or  connect(),  because  the  window   scale
           option   is   negotiated   when   the   connection  is
           established. Once the connection has been made, it  is
           too  late  to  increase  the  send  or  receive window
           beyond the default TCP limit of 64K.

        o  For all applications, use ndd(1M) to modify the confi-
           guration      parameter     tcp_wscale_always.      If
           tcp_wscale_always is set to 1, the window scale option
           will  always   be set when connecting to a remote sys-
           tem.  If tcp_wscale_always  is  0,  the  window  scale
           option  will  be set only if  the user has requested a
           send or receive window larger than  64K.  The  default
           value of tcp_wscale_always is 0.

        o  Regardless of the value of tcp_wscale_always, the win-
           dow  scale option will always be included in a connect
           acknowledgement if the connecting system has used  the
           option.

     Turn on SACK capabilities in the following way:

        o  Use  ndd  to  modify   the   configuration   parameter
           tcp_sack_permitted. If tcp_sack_permitted is set to 0,
           TCP will not accept SACK or send out SACK information.
           If  tcp_sack_permitted  is set to 1, TCP will not ini-
           tiate a connection with SACK permitted option  in  the
           SYN  segment,  but  will  respond  with SACK permitted
           option  in  the   SYN|ACK  segment  if   an   incoming
           connection  request  has  the  SACK  permitted option.
           This means that TCP will only accept SACK  information
           if the other side of the connection also accepts  SACK
           information.  If tcp_sack_permitted is set  to  2,  it
           will  both  initiate  and accept connections with SACK
           information. The default for  tcp_sack_permitted is  2
           (active enabled).

     Turn on the TCP ECN mechanism in the following way:

        o  Use  ndd  to  modify   the   configuration   parameter
           tcp_ecn_permitted.  If  tcp_ecn_permitted is set to 0,
           TCP will not negotiate ECN with a peer  that  supports
           it.  If  tcp_ecn_permitted  is set to 1, TCP will not
           advertise ECN support when  initiating  a  connection,
           but  will  advertise  it  when  accepting  an incoming
           connection request whose SYN  segment  indicates  that
           the  peer  supports  ECN.  If tcp_ecn_permitted is set
           to 2, TCP negotiates ECN  when  accepting  connections
           and  also  indicates  ECN  support in the outgoing SYN
           segment when it  makes  active  outgoing  connections.
           The default for tcp_ecn_permitted is 1.

     Turn on the time stamp option in the following way:

        o  Use  ndd  to  modify   the   configuration   parameter
           tcp_tstamp_always. If tcp_tstamp_always is 1, the time
           stamp option will always be set when  connecting  to a
           remote   machine.  If  tcp_tstamp_always  is   0,  the
           time stamp option will not be set when connecting to a
           remote system. The default for tcp_tstamp_always is 0.

        o  Regardless of the value of tcp_tstamp_always, the time
           stamp option will always be included in a connect ack-
           nowledgement (and all succeeding packets)  if the con-
           necting system has used the time stamp option.

     Use the following procedure to turn on the time stamp option
     only  when the window scale option is in effect:

        o  Use   ndd  to  modify  the   configuration   parameter
           tcp_tstamp_if_wscale. Setting  tcp_tstamp_if_wscale to
           1 will cause the time stamp option to be set when con-
           necting to a remote system, if the window scale option
           has been set. If tcp_tstamp_if_wscale is  0, the  time
           stamp  option  will  not  be  set when connecting to a
           remote system. The default   for  tcp_tstamp_if_wscale
           is  1.

     Protection Against Wrap Around Sequence  Numbers  (PAWS)  is
     always used when the time stamp option is set.

     SunOS also supports multiple methods of  generating  initial
     sequence  numbers.   One  of  these  methods is the improved
     technique suggested in  RFC 1948.  We HIGHLY recommend  that
     you  set  the  sequence  number  generation  parameters  as
     close to boot time as possible.   This  prevents  sequence
     number  problems  on connections that use the same connec-
     tion-ID as ones that used  a  different  sequence  number
     generation method.
     The  /etc/init.d/inetinit  script  contains  commands  which
     configure  initial  sequence number  generation.  The script
     reads  the  value  contained  in  the   configuration   file
     /etc/default/inetinit to determine which method to use.

     The /etc/default/inetinit file is an unstable interface, and
     may change in future releases.

     TCP may be configured to report some information on  connec-
     tions that terminate by means of an RST packet.  By default,
     no logging is done. If the ndd(1M)  parameter  tcp_trace  is
     set  to  1, then trace data is collected for all new connec-
     tions established after that time.

     The trace data consists of the TCP headers and IP source and
     destination  addresses  of the last few packets sent in each
     direction before RST occurred. Those packets are logged in a
     series  of strlog(9F) calls.  This trace facility has a very
     low overhead, and  so  is  superior  to  such  utilities  as
     snoop(1M)  for  non-intrusive debugging of connections that
     terminate by means of an RST.


SEE ALSO

     ndd(1M),  ioctl(2),  read(2),   write(2),   accept(3SOCKET),
     bind(3SOCKET),   connect(3SOCKET),  getprotobyname(3SOCKET),
     getsockopt(3SOCKET),    listen(3SOCKET),      send(3SOCKET),
     inet(7P), inet6(7P), ip(7P), ip6(7P)

     Ramakrishnan, K., Floyd, S., Black, D., RFC 3168, The  Addi-
     tion  of  Explicit Congestion Notification (ECN) to IP, Sep-
     tember 2001.

     Mathis,  M.  and  Mahdavi,  J.,  Pittsburgh   Supercomputing
     Center;  Floyd,  S.,  Lawrence Berkeley National Laboratory;
     Romanow, A., Sun Microsystems, Inc., RFC 2018, TCP Selective
     Acknowledgment Options, October 1996.

     Bellovin, S.,  RFC 1948, Defending Against  Sequence  Number
     Attacks, May 1996.

     Jacobson, V., Braden, R., and Borman,  D.,   RFC  1323,  TCP
     Extensions for High Performance, May 1992.

     Postel, Jon, RFC 793, Transmission Control Protocol -  DARPA
     Internet Program Protocol Specification, Network Information
     Center, SRI International, Menlo Park, CA., September 1981.


DIAGNOSTICS

     A socket operation may fail if:

     EISCONN
           A connect() operation was attempted  on  a  socket  on
           which  a  connect()  operation  had  already been per-
           formed.

     ETIMEDOUT
           A connection was dropped due to excessive  retransmis-
           sions.

     ECONNRESET
           The remote peer forced the  connection  to  be  closed
           (usually  because  the  remote  machine has lost state
           information about the connection due to a crash).

     ECONNREFUSED
           The remote peer actively refused connection establish-
           ment  (usually  because no process is listening to the
           port).

     EADDRINUSE
           A bind() operation was attempted on a  socket  with  a
           network  address/port pair that has already been bound
           to another socket.

     EADDRNOTAVAIL
           A bind() operation was attempted on a  socket  with  a
           network address for which no network interface exists.

     EACCES
           A bind() operation was  attempted  with  a  "reserved"
           port  number  and the effective user ID of the process
           was not that of the privileged user.

     ENOBUFS
           The system ran out of memory for internal data  struc-
           tures.
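
     Three of the errors above can be provoked on a  single  host.
     The  sketch  below  is  illustrative only: the function names
     are hypothetical, all addresses are the loopback address, and
     each  wrapper  returns  0 on success or the errno value of the
     failed call.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Loopback address with the given port (host byte order). */
static void loopback(struct sockaddr_in *sin, unsigned short port)
{
    memset(sin, 0, sizeof(*sin));
    sin->sin_family = AF_INET;
    sin->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sin->sin_port = htons(port);
}

/* bind() s to the loopback address and port. Returns 0 or errno. */
int try_bind(int s, unsigned short port)
{
    struct sockaddr_in sin;

    loopback(&sin, port);
    return bind(s, (struct sockaddr *)&sin, sizeof(sin)) == 0 ? 0 : errno;
}

/* connect() s to the loopback address and port. Returns 0 or errno. */
int try_connect(int s, unsigned short port)
{
    struct sockaddr_in sin;

    loopback(&sin, port);
    return connect(s, (struct sockaddr *)&sin, sizeof(sin)) == 0 ? 0 : errno;
}

/* Store the port s is bound to (host byte order). Returns 0 or errno. */
int bound_port(int s, unsigned short *port)
{
    struct sockaddr_in sin;
    socklen_t len = sizeof(sin);

    assert(port != NULL);
    if (getsockname(s, (struct sockaddr *)&sin, &len) < 0)
        return errno;
    *port = ntohs(sin.sin_port);
    return 0;
}
```

     Binding a second socket to a port  that  is  already  bound  is
     expected  to  yield  EADDRINUSE,  connecting to a port with no
     listener ECONNREFUSED, and calling try_connect()  twice  on  a
     socket whose first connection succeeded EISCONN.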

