Much has been written about the security of SIP-based networks and the security of Voice over IP in general, but ultimately good security is a complete architecture, not a single product or protocol. The advent of Fixed/Mobile Convergence and IMS has created some widely accepted standards, and has equally highlighted architectural differences in the converging networks. Whilst the overall objective of providing a flexible, secure network and secure services is common, the implementation details differ from network to network.
Even within the standards themselves there is sometimes an assumption of trust that may not exist in reality. For example, the IMS definition within TISPAN assumes that the signalling elements are able to handle excessive signalling rates and badly formed signalling messages. In reality these elements are designed to process sessions, and handling attacks at the same time may not be the best use of the equipment. In a data-centric network we would expect to see servers ringed by firewalls and intrusion detection and prevention systems, so why would we build a media-centric network any other way?
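
To make the point concrete, the following is a minimal sketch of the kind of screening a border element could perform before signalling ever reaches a session-processing element: rejecting badly formed requests and limiting the request rate per source. It is an illustration only, not taken from TISPAN or from any product. The class name `SignallingFilter`, the rate and burst values, and the simplified request-line pattern are all assumptions made for this example, and SIP responses are deliberately out of scope.

```python
import re
import time

# Hypothetical limits, chosen only for illustration.
MAX_REQUESTS_PER_SECOND = 10.0
BURST = 20.0

# Simplified shape of a SIP request line: METHOD SP Request-URI SP SIP/2.0
# (an assumption for this sketch; a real parser would be far stricter).
REQUEST_LINE = re.compile(r"[A-Z]+ \S+ SIP/2\.0")


class SignallingFilter:
    """Screens SIP requests before they reach a signalling element."""

    def __init__(self, rate=MAX_REQUESTS_PER_SECOND, burst=BURST,
                 clock=time.monotonic):
        self.rate = rate      # tokens added per second, per source
        self.burst = burst    # maximum tokens a source can accumulate
        self.clock = clock    # injectable so the filter can be tested
        self.buckets = {}     # source -> (tokens, time of last update)

    def _within_rate(self, source):
        # Token bucket: refill according to elapsed time, then spend one token.
        now = self.clock()
        tokens, last = self.buckets.get(source, (self.burst, now))
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens < 1.0:
            self.buckets[source] = (tokens, now)
            return False
        self.buckets[source] = (tokens - 1.0, now)
        return True

    def accept(self, source, message):
        # Drop badly formed requests first, so they never consume
        # capacity on the element that processes sessions.
        first_line = message.split("\r\n", 1)[0]
        if not REQUEST_LINE.fullmatch(first_line):
            return False
        return self._within_rate(source)
```

A caller would pass each incoming request through `accept(source, message)` and forward only those for which it returns `True`. Keeping this check outside the signalling element is the design choice argued for above: the session-processing equipment spends its capacity on sessions rather than on attack traffic.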