For me, Second Life has always been more about human communication, collaboration, and spirit than about technology. When I talk to Residents about their experiences, one of the recurring themes is improving our communication methods. For so many, Second Life is a place to make and meet new friends and collaborate with others, whether that’s in a business, educational or purely social context.
That’s why today I’m pleased to announce our intention to bring integrated voice capabilities to the Grid. This will enable all Residents to speak with each other if they wish, in addition to the existing Instant Messaging and group chat functions.
Many of you know that voice has always been part of the long-term plan for the Grid, and we truly believe voice can be a transformative technology that will lend more immediacy and dynamism to the way Residents communicate.
Voice in Second Life will offer high-quality communication capabilities with 3D “proximity-based” voice communication. This technology uses spatial awareness, taking distance, direction, and rotation into account, for a more realistic experience. Basically, you’ll be able to tell who is talking in a group since the voice will sound like it’s coming from that direction. We’re also working hard on an initial set of avatar animations, which change and trigger according to the intensity of speech.
A limited beta trial on a test grid launches next week, followed by a Grid-wide beta later in March that is open to all Residents. Official launch is scheduled for some time in Q2 this year; we’ll share more details nearer the time. There will be no additional charge for using voice for Residents or land-owners during either of these beta trial periods.
We’d love to get people involved at all stages of the beta, so if you’re interested in (and serious about) participating in the initial run starting next week, please drop me an email at 3Dvoice@lindenlab.com. All you need is a headset with a microphone. This initial beta trial will occur on a test grid and will require a separate viewer download. Details will be provided by email to those interested. If you’d rather test the voice technology on your own land in the context of the main production Grid, please do not email the address above about the limited beta1 trial; instead, wait for the open beta2, which begins several weeks later.
Clearly, this is a complex technology to deliver, and we’ve been working with a couple of great partners for some time now to ensure that we provide high quality voice, while not impacting the performance or the ongoing development of the Grid. We’ve worked with Vivox to provide a first-class, scalable, end-to-end user experience and brought in another firm, DiamondWare, which has perfected the 3D spatial audio positioning technology. We believe this combination will make for the most realistic voice experience in a virtual world.
We know that many of you have questions about how and when voice can be used, how it works and what the different options are. I’ll start with a brief FAQ that may be helpful as a launch point for specific answers to your questions in the blog comments.
— Joe Linden
Q. Why are you doing this?
Voice is part of our ongoing effort to create a richer, more immersive virtual environment for Residents. We feel it marks a natural progression in the evolution of Second Life, and will prove to be a transformative technology for Residents. We anticipate that voice will be particularly valuable to Resident groups such as educators, non-profits and businesses, who might use Second Life as a collaborative tool for learning and training.
Q. How long have you been planning this?
Voice has always been part of the long-term plan.
Q. Do you see this as a wholly positive move? What about the Residents that want to keep their real life identity anonymous? Isn’t this encroaching on one of the main selling points of Second Life?
Our aim has always been to give Residents the tools they need to fully embrace and engage with Second Life in whichever way they are most comfortable. Voice represents an additional option for achieving this, and gives Residents the choice of speaking to each other on voice-enabled land if they wish. Those who prefer to can continue to use the Instant Messaging and chat functions to communicate with others. Many of our Residents have been requesting voice for some time now, so it’s clear that for those users, voice will be a boon.
Q. What will Residents be able to do with their voice capability?
There are many ways Residents can use Voice (usage scenarios are outlined below in the technical section), but we anticipate that voice will be particularly valuable to groups such as educators, non-profits and businesses, who might use Second Life as a collaborative tool for learning and training.
For example, academic institutions could use Second Life to carry out lectures in front of a large group audience or corporations could use it for customer training purposes.
Specific technical issues / considerations
Q. Did you develop proprietary technology, or is this a partnership agreement?
We’ve partnered with Vivox to provide in-world voice. Many Residents will already have tried their bright red phone booths in Second Life, and our 3D integration is an extension of the infrastructure Vivox has already built out, which is successfully serving many existing customers. We evaluated a number of providers and in the end selected a combination of two. Vivox provides a first-class user experience for end-to-end voice services, without impacting the load or performance on the primary Grid. We wanted to make it even more immersive, so we brought in another firm, DiamondWare, which has perfected the 3D spatial positioning technology. We think the combination of the two makes for the most realistic voice experience in a virtual world.
Intuitively when you speak with someone, you want a sense of direction and distance. People who are nearer should sound louder, and the person on your right should sound as if their voice is coming from that location. Without that, the voice is flat, and hard to distinguish from other active voices in the area. Our partners give us exactly that ‘3D audio’ feel without additional plug-ins, hardware or bandwidth overload.
Q. Is voice an automatic option? How are you enabled?
Voice capability is linked to the land parcel rather than an individual Resident, so in that sense it’s automatic. The entire Mainland in the Second Life Grid will be voice-enabled by default, with individual land owners able to opt out if they so choose. Private island owners also have the ability to turn on voice as they wish if they’re on a current payment plan (grandfathered plans may require an additional fee).
Of course, if you wish not to speak, that’s fine – there will be an indicator telling other Residents that you are not voice-enabled. While we expect the vast majority of Residents to jump into voice conversations immediately, we fully understand that others may be more cautious or simply decide to stick to IM and Chat.
Q. What equipment do you need?
All you need is a pair of headphones with a microphone – just as you would with a standard VoIP service. Without a headset and mic the sound quality will not make voice communication an enjoyable experience for you or your friends – too much feedback, echo or missing words. If you’ve used one of the popular VoIP services, you’ll know what we mean.
The headsets are widely available from retail and online electronics stores at a range of prices, so there will be one for you.
Q. Can you modify/change your voice? If not, why not?
At present we’re not offering voice modulation or modification. We understand that some Residents may wish to preserve their anonymity even further by disguising their voice. Those who wish can use third party software to modify their voice, but when evaluating it as a standard feature we found the existing technologies just not able to deliver a high-quality result.
Q. Describe how the 3D spatial positioning works
Essentially this means that voices in Second Life will sound as if they are coming from the location of the speaker. So if the person you are speaking to is far away, they’ll sound fainter than someone who is closer. But as they walk towards you, their voice will get louder. If you are in a group, the person on your right will be heard on your right, and those standing to the left of you will sound as if they are indeed on your left.
Of course, all this spatial positioning is specific to you. Others will hear voice locations from their own perspective. This can obviously get quite complex in large groups of moving avatars but was necessary to make the voice experience more authentic.
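To make the idea concrete, here is a minimal sketch of per-listener positioning in Python. It is an illustration only, not the Vivox or DiamondWare implementation: the function name, the 60-metre range, the linear falloff and the constant-power pan law are all assumptions chosen for simplicity. It works in 2D and does not distinguish front from back.

```python
import math

def spatialize(listener_pos, listener_heading, speaker_pos, max_range=60.0):
    """Return (left_gain, right_gain) for one speaker as heard by one listener.

    listener_heading is in radians, counter-clockwise from the +x axis.
    max_range is a hypothetical audible radius, not a documented value.
    """
    dx = speaker_pos[0] - listener_pos[0]
    dy = speaker_pos[1] - listener_pos[1]
    distance = math.hypot(dx, dy)
    if distance >= max_range:
        return (0.0, 0.0)  # out of earshot

    # Assumed linear falloff: full volume at zero distance, silent at max_range.
    volume = 1.0 - distance / max_range
    if distance == 0.0:
        centre = volume * math.sqrt(0.5)
        return (centre, centre)

    # Direction of the speaker relative to where the listener is facing.
    bearing = math.atan2(dy, dx) - listener_heading
    # pan runs from -1 (fully left) to +1 (fully right). With counter-clockwise
    # angles, a speaker on the listener's right has a negative bearing.
    pan = -math.sin(bearing)

    # Constant-power pan law keeps perceived loudness steady across the arc.
    angle = (pan + 1.0) * math.pi / 4.0
    return (volume * math.cos(angle), volume * math.sin(angle))
```

Because the listener's position and heading are inputs, calling the function once per listener gives each Resident gains from their own perspective, which is why the work grows quickly in a large group of moving avatars.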
You can also shout, which makes your voice even louder to those standing close to you, and those who were previously beyond range may hear you. We will moderate particularly loud sounds for the benefit of Residents’ ears and also because some people naturally talk loudly or have headsets with particularly sensitive microphones.
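As a rough sketch of how shouting and loudness moderation might fit together, the Python below extends the audible range when a Resident shouts and scales down any block of audio whose peak is too high. The ranges, the ceiling and both function names are hypothetical; the actual moderation method is not described here, and a production system would more likely use a smoothed limiter than per-block scaling.

```python
def audible_range(shouting, normal_range=60.0, shout_range=100.0):
    """Return how far a voice carries, in metres (both values are assumed)."""
    return shout_range if shouting else normal_range

def moderate(samples, ceiling=0.8):
    """Scale a block of samples so its peak never exceeds `ceiling`.

    Samples are floats in the range -1.0..1.0; quiet blocks pass through as-is.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak <= ceiling:
        return list(samples)
    scale = ceiling / peak
    return [s * scale for s in samples]
```

Scaling the whole block by one factor keeps the shape of the waveform intact, so a loud talker or an over-sensitive microphone is turned down rather than clipped.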
Unlike real life, if you do have trouble hearing someone, you can turn up the volume on your computer at a local level.
Q. What are the different options available?
There will be several usage scenarios available in terms of group and private one-to-one conversations:
Scenario 1 – Residents can teleport to voice-enabled land, and automatically start speaking, with the volume of speech modified according to their spatial relationship with others.
Scenario 2 – Group conference calls for two or more Residents. This enables Residents to communicate with large groups regardless of geographical boundaries (e.g. concert setting, multi-sim events, or between pockets of land etc).
Scenario 3 – One-to-one personal communication. This enables Residents to privately share a conversation, which can be initiated by an Instant Message. Residents don’t have to be on voice-enabled land to do this.
Q. How does the avatar express what they’re saying? Is it through speech bubbles or lip-synching?
When speaking, Residents’ avatars will become animated according to the amplitude of their voice. Residents may disable this feature entirely or easily customize it to match their mood or attitude. There are literally thousands of gestural combinations available.
There isn’t lip synching for the reason that even minor variations in timing of the voice and lips can become very disconcerting, and actually distract from what the speaker is saying. The delays could come from common issues such as minor fluctuations in bandwidth or other tasks taking CPU time. We opted for extremely high voice quality with more generic animation to address this issue.
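The amplitude-driven animation described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the thresholds and gesture names are invented for the example, and the viewer's real gesture set and trigger logic are not specified here.

```python
import math

# Hypothetical loudness tiers: (upper threshold, gesture name).
# None means the avatar stays idle.
GESTURE_TIERS = [
    (0.05, None),
    (0.20, "talk_soft"),
    (0.50, "talk_medium"),
]
LOUDEST_GESTURE = "talk_intense"

def rms(samples):
    """Root-mean-square level of a block of samples in -1.0..1.0."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def pick_gesture(samples, enabled=True):
    """Choose a gesture from speech intensity; returns None when idle or disabled."""
    if not enabled:
        return None
    level = rms(samples)
    for threshold, gesture in GESTURE_TIERS:
        if level < threshold:
            return gesture
    return LOUDEST_GESTURE
```

Because the gesture depends only on overall loudness over a short block, small timing differences between the audio and the animation are far less noticeable than they would be with lip synching.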
Q. Can you shout or whisper using this?
Yes – you can do both, and whistle, sing or hum. The codec used in our implementation carries frequencies from 50Hz to 14,000Hz and is ideally suited for music and ambient sound as well as voice.
Q. How does this affect Second Life Grid/server performance? Won’t this make the experience even slower?
The addition of voice should not affect the normal operation of the Second Life Grid since the service is provided by different servers via a third party. Just as video content or music streaming is handled by systems and bandwidth off the main Grid, the voice channels are too.
Q. How does this affect the PC specifications for using Second Life Viewer?
Residents will already have noticed that Second Life works best on a modern machine with a broadband connection. Voice will marginally increase the bandwidth requirements, though we have optimized it so that all voices are condensed into a single mixed voice channel, customized for each client. Residents running Second Life on a compatible computer over a broadband connection, with no other applications in the background, will get the best results and will not notice degradation. Adding the voice channel necessarily takes some bandwidth and some processor power, but no more than other VoIP services in common use. In fact, the codec (Siren14) was selected partly because of its low computing resource requirements compared to other codecs of its class.
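The "single mixed voice channel, customized for each client" idea can be illustrated with a short Python sketch. It assumes the mixing happens before the audio reaches the client and that each listener has a gain per speaker (for example, from distance); the function name and data layout are hypothetical and are not taken from the Vivox service.

```python
def mix_for_listener(listener, frames, gains):
    """Combine every other speaker's audio into one stream for `listener`.

    frames: dict mapping speaker name -> list of samples in -1.0..1.0
    gains:  dict mapping (listener, speaker) -> gain; missing pairs are silent
    """
    length = max((len(frame) for frame in frames.values()), default=0)
    mixed = [0.0] * length
    for speaker, frame in frames.items():
        if speaker == listener:
            continue  # a Resident does not hear their own voice played back
        gain = gains.get((listener, speaker), 0.0)
        for i, sample in enumerate(frame):
            mixed[i] += gain * sample
    # Clamp so that several loud voices together cannot exceed full scale.
    return [max(-1.0, min(1.0, s)) for s in mixed]
```

The point of this design is that each client receives one stream however many people are talking, so the client's bandwidth cost stays roughly constant as a conversation grows.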