
RE: draft-lehtovirta-rtpsec-infra-00



> -----Original Message-----
> From: Xu Chen [mailto:chenxu0128@xxxxxxxxxx] 
> Sent: Thursday, March 08, 2007 6:25 PM
> To: 'Dan Wing'
> Cc: ietf-rtpsec@xxxxxxxxxxxx; 'Vesa Lehtovirta (JO/LMF)'
> Subject: RE: draft-lehtovirta-rtpsec-infra-00
> 
> hi, Dan

Hi.

> I have some questions about section 6. In this case, should we have a
> flow-based media gateway to route audio stream to audio device, like a
> handset, and video stream to video device, like a PDA. 

Leaving SRTP out of the picture for a moment, let's have the audio device be
2.2.2.2 and the video device be 3.3.3.3.  

If those devices initiate a call, their SDP offer would look like this:

      v=0
      o=- 2890844526 2890842807 IN IP4 host.example.com
      t=0 0
      m=audio 1234 RTP/AVP 0
      c=IN IP4 2.2.2.2
      m=video 1234 RTP/AVP 31
      c=IN IP4 3.3.3.3
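
As a rough illustration of how such an offer could be put together, here is a
minimal Python sketch. The function name build_offer and the tuple layout are
my own assumptions, not part of any SDP library, and the output is abbreviated
in the same way as the example above (plain newlines, only the lines shown).

```python
def build_offer(origin_host, media):
    """Assemble an abbreviated SDP offer with one c= line per m= section.

    `media` is a list of (kind, port, proto, fmt, addr) tuples; this layout
    is a hypothetical convenience for the sketch, not a standard structure.
    """
    lines = [
        "v=0",
        f"o=- 2890844526 2890842807 IN IP4 {origin_host}",
        "t=0 0",
    ]
    for kind, port, proto, fmt, addr in media:
        lines.append(f"m={kind} {port} {proto} {fmt}")
        # A media-level c= line lets each stream terminate on its own device.
        lines.append(f"c=IN IP4 {addr}")
    return "\n".join(lines)


offer = build_offer("host.example.com", [
    ("audio", 1234, "RTP/AVP", 0, "2.2.2.2"),   # audio device
    ("video", 1234, "RTP/AVP", 31, "3.3.3.3"),  # video device
])
print(offer)
```

The point of the sketch is only that the connection address is set per media
section rather than once for the whole session.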

With SRTP and a=key-mgmt (MIKEY), you'd have two a=key-mgmt lines (one for
each m= line).  Using the above example, you would now have:

      v=0
      o=- 2890844526 2890842807 IN IP4 host.example.com
      t=0 0
      m=audio 1234 RTP/SAVP 0
      c=IN IP4 2.2.2.2
      a=key-mgmt ...
      m=video 1234 RTP/SAVP 31
      c=IN IP4 3.3.3.3
      a=key-mgmt ...

Same thing if you're using a=crypto:

      v=0
      o=- 2890844526 2890842807 IN IP4 host.example.com
      t=0 0
      m=audio 1234 RTP/SAVP 0
      c=IN IP4 2.2.2.2
      a=crypto ...
      m=video 1234 RTP/SAVP 31
      c=IN IP4 3.3.3.3
      a=crypto ...

Taking ZRTP-03, you'd have:

      v=0
      o=- 2890844526 2890842807 IN IP4 host.example.com
      t=0 0
      m=audio 1234 RTP/AVP 0
      c=IN IP4 2.2.2.2
      a=zrtp-zid 2839
      m=video 1234 RTP/AVP 31
      c=IN IP4 3.3.3.3
      a=zrtp-zid 389531

DTLS-SRTP works similarly, using a=fingerprint.
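
The pattern in all of these examples is the same: one keying attribute per
m= section. A small Python sketch of that pattern follows; the helper name
add_media_attribute and the dict keyed by media type are assumptions made for
illustration, and the attribute values are simply the ZRTP ones from the
example above.

```python
def add_media_attribute(sdp, attrs):
    """Append one a= line at the end of each m= section.

    `attrs` maps a media type ("audio", "video") to the full a= line to add.
    Hypothetical helper for illustration; it assumes at most one m= section
    per media type.
    """
    out = []
    pending = None  # attribute waiting to be written at the end of a section
    for line in sdp.splitlines():
        if line.startswith("m="):
            if pending is not None:
                out.append(pending)      # close the previous media section
            kind = line[2:].split()[0]   # "audio", "video", ...
            pending = attrs.get(kind)
        out.append(line)
    if pending is not None:
        out.append(pending)              # close the last media section
    return "\n".join(out)


base = "\n".join([
    "v=0",
    "o=- 2890844526 2890842807 IN IP4 host.example.com",
    "t=0 0",
    "m=audio 1234 RTP/AVP 0",
    "c=IN IP4 2.2.2.2",
    "m=video 1234 RTP/AVP 31",
    "c=IN IP4 3.3.3.3",
])
keyed = add_media_attribute(base, {
    "audio": "a=zrtp-zid 2839",
    "video": "a=zrtp-zid 389531",
})
print(keyed)
```

Because each attribute is attached at the media level, each device's keying
material stays with its own stream.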

So I think an offer works fine already.  The difficulty is deciding which
device (the video or the audio device) actually sends the offer, and how it
controls and coordinates with the slave device.  That could be done via SIP,
I suppose, or via some sort of control protocol (H.248, perhaps, or
something proprietary).


Receiving an SDP offer in an INVITE could be handled similarly, with a
similar requirement for some sort of protocol between the audio and the
video devices so that they can learn each other's UDP port, codecs, and
whatever else needs to be in the SDP itself.
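
To make the receiving side a bit more concrete, here is a hedged Python
sketch of splitting an incoming offer into its media sections, so that each
section could be handed to the matching device. The function name
split_by_media is hypothetical, the remote address 192.0.2.10 and the ports
are made-up values for the example, and the sketch assumes at most one m=
section per media type.

```python
def split_by_media(sdp):
    """Split an SDP body into session-level lines and per-media sections.

    Returns (session_lines, sections), where `sections` maps the media type
    of each m= line ("audio", "video") to the lines of that section.
    Hypothetical helper; it assumes one m= section per media type.
    """
    session = []
    sections = {}
    current = session
    for raw in sdp.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("m="):
            kind = line[2:].split()[0]
            current = sections.setdefault(kind, [])
        current.append(line)
    return session, sections


incoming = "\n".join([
    "v=0",
    "o=- 2890844526 2890842807 IN IP4 host.example.com",
    "t=0 0",
    "m=audio 5004 RTP/AVP 0",    # hypothetical remote port
    "c=IN IP4 192.0.2.10",       # hypothetical remote address
    "m=video 5006 RTP/AVP 31",
    "c=IN IP4 192.0.2.10",
])
session, sections = split_by_media(incoming)
print(sections["audio"])  # lines the audio device would need
print(sections["video"])  # lines the video device would need
```

The answer would still have to be assembled from what both devices report
back, which is where the inter-device protocol comes in.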

> If media gateway is required, where should SRTP stream terminate? 

By 'media gateway', are you referring to a device that is interworking
between SRTP and RTP, or are you referring to a device like a NAT or
an SBC that is rewriting the transport address (IP address and UDP
port)?

I just finished reviewing zrtp-03, and it contains a very nice write-up
of how such boxes should interwork with media-path key management 
protocols (such as ZRTP).  See
http://tools.ietf.org/html/draft-zimmermann-avt-zrtp-03#section-14

> and is this a forking problem?

I think you could use forking to split incoming calls to the audio device
and the video device, but I doubt the originating device would handle that
well.  It might be unfamiliar with receiving two SDP answers from two
different branches, where one of those SDP answers indicates support for
audio and the other indicates support for video.

However, outgoing calls would need to have one SDP that describes the offer,
so I think you cannot escape the requirement for a protocol between the
devices.


I admit, however, this is outside of my familiarity with SIP and SDP.  You
might see if Paul Kyzivat (CC'd) or the SIP working group have suggestions
on how this should be appropriately signaled.  

In any event, I think all of the key exchange mechanisms work fine with
this; the complication is really around SDP and SIP signaling, rather than
SRTP keying.

-d


> Best regards,
> Xu
> 
> 
> >     >  6.  Termination of media streams in different devices
> >     >
> >     >     In some cases, different media streams might be
> >     >     terminated in different devices.  For example, the
> >     >     video part of a multimedia session could terminate in
> >     >     one device, while the audio part would terminate in
> >     >     another device.  It should be possible to set up media
> >     >     security efficiently in such scenarios.
> > 
> > I agree this is a requirement, but I believe all of the existing
> > key exchange mechanisms support this requirement, as do all of
> > the new generation of key exchange mechanisms.  Do you feel 
> > they are lacking in this support?
> > 
> > -d
>