Rolling Restart Tue-Thu 07/15-17 to deploy server version 1.23.1

Update 2008-07-16 05:40am : We have reverted the ~1000 hosts on 1.23.1 to 1.22.4.

Update 2008-07-15 05:28pm : Names are appearing as “(???) (???)” in estate ban lists on the regions which have been updated to 1.23.1. We tentatively plan to revert those regions back to 1.22 by tomorrow morning, and will probably slip the 1.23 roll-out by another day. Before making a firm decision, we will also analyze server crash data from this pilot roll to look for other issues not previously identified. - Joshua Linden

Update 2008-07-15 02:22pm : The pilot roll to 1174 regions is complete. However, because of an error Prospero made when starting the roll, there are about 300 regions that will remain down for another 10-20 minutes. For this, he apologizes.

We have identified and fixed the memory leak that was in server version 1.23.0. As such, we will be rolling out server version 1.23.1 to Second Life this week. This includes all of the fixes from 1.23.0 (see the 1.23.0 blog post for a full list of changes), as well as a fix for the object text newline bug (SVC-2633) and the fix for the memory leak.

The server will be rolled out according to the following schedule:

  • Tuesday, sometime during the day : a pilot roll to 1000 regions. We are going to do a larger-than-usual pilot roll to have a large enough sample to verify that there are no other memory leaks beyond the one we’ve discovered and fixed.
  • Wednesday morning, 5AM-10AM : we will deploy server version 1.23.1 to half of Second Life.
  • Thursday morning, 5AM-10AM : we will deploy server version 1.23.1 to the rest of Second Life.
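
As an illustration only (this is not Linden Lab's deploy tooling), the three-step schedule above can be sketched in Python. The function name `plan_waves`, the integer host IDs, and the odd/even split of the remaining hosts are assumptions made for the example; the post itself speaks of regions and only says that a larger-than-usual pilot group goes first and the rest of the grid follows in two halves.

```python
import random


def plan_waves(host_ids, pilot_size, seed=None):
    """Split hosts into a random pilot wave plus two follow-up waves.

    Hypothetical helper: integer host IDs and the parity split are
    assumptions for illustration, not the real roll-out procedure.
    """
    hosts = sorted(set(host_ids))
    if pilot_size > len(hosts):
        raise ValueError("pilot_size exceeds the number of hosts")

    # Tuesday: pick the pilot hosts at random.
    rng = random.Random(seed)
    pilot = set(rng.sample(hosts, pilot_size))

    # Wednesday / Thursday: split whatever is left into two groups.
    # Assumption: odd-numbered hosts go first, even-numbered hosts second.
    remaining = [h for h in hosts if h not in pilot]
    wave_two = [h for h in remaining if h % 2 == 1]
    wave_three = [h for h in remaining if h % 2 == 0]
    return sorted(pilot), wave_two, wave_three
```

Calling `plan_waves(range(1, 21), pilot_size=4)` returns three disjoint lists that together cover every host exactly once, which is the property a staged roll-out needs.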

As usual with rolling restarts, this is a change on the server side; there will be no required client updates associated with this rolling restart. Regions will receive warnings starting five minutes before they are restarted. There is no way to delay the restart of a given region. Regions should restart within 10 minutes of going down. If your region stays down for more than 20 or 30 minuets, please contact support.

This entry was posted in Announcements & News, Bugs & Fixes, New Releases.

129 Responses to Rolling Restart Tue-Thu 07/15-17 to deploy server version 1.23.1

  1. Sindy Tsure says:

    /me crosses her fingers. Good luck, Prospero! 🙂

    Is it odd-numbered hosts first? (is it always odds first?)

    Does this push the MONO plan back any?

  2. Hopes for the best and that the world does not collapse 😉
    I guess mobit and Turbo wil be as usuall on the last wave of the deploy.
    Looking forward to a more stable version.

  3. Prospero Linden says:

    The hosts for Tuesday will be randomly chosen.

    I’m not going to tell you odd vs. even for Wednesday, but I will tell you why I’m not telling you. The reason is, it would give the false impression that I’ve given you useful information. Regions are *not* locked to hosts, except during rolling restarts. Each time a region restarts, it starts up on the next available host, which could be any host of the same class.

    Once the rolling restart begins, I will announce whether I’m doing the odd or even hosts, because regions *are* locked to restart on the same host during a rolling restart.

    Re: MONO: the default plan is that, if all goes well, Mono will go out the second week after the 1.23 deploy. That would mean server version 1.24 (including Mono) would be deployed the week of July 28. This will slip if more serious bugs are discovered in 1.23 (requiring either a delay of the 1.23 roll or a patch roll), or if it takes longer than expected to get Mono (and some other patches) merged into the server code and stabilized.

  4. A couple of thoughts:

    1) there is a huge event going on this week in SL called Relay For Life. A heck of a time to be doing rolling restarts with all the events planned, and have been in the planning stage for months.

    2) Maybe for the initial “testing” rollout you could have people sign up sims for those, just like you did for the havok 4 testing. This list would always be used for the first phase of the rollout and maybe could be tested for a week before full rollout, this would give you a better test bed. then you can rollout to the rest of the grid the following tuesday and wednesday. I would be one to sign up a couple of sims for this, both full and openspace.

  5. pork chops and applesauce says:

    Yuk. no solid info. No solid releases . double restarts last week on regions . Weekly restarts now. And even a gamble on weather or not this release will work .
    lol Tuesday, sometime during the day : a pilot roll to 1000.
    Why no time? not ready yet? going to just throw it to the dogs and see?

  6. Cincia Singh says:

    Good luck with the roll Prospero! I’m really looking forward to the fixes in 1.23.1 especially the Group IM bug fix! Keep up the good work!

  7. Prospero Linden says:

    Spritely, re: (2)… the plan is to develop a list of regions designated for the initial rollout to use in the future, including those who want to opt in to it. That still won’t be for a full week though. However, we are putting major upgrades on the Preview Grid a week before it gets deployed to Second Life, and are encouraging people to test it. (The reason this week’s deploy wasn’t on the Preview Grid for a full week is that it contains just two very small, scoped changes to the 1.23 code which has now been on the Preview Grid for nearly two weeks.)

  8. Soap Clawtooth says:

    I’ll sit here quietly and wait for the MONO roll out to go horribly wrong and the subsequent rollback to 1.23

  9. richard says:

    number 3 no gambling allow !!!!!

  10. Ann Otoole says:

    Hope it goes well for your team Prospero. Nobody likes a rollback especially the people that have been busting hard to get it done in the first place.

  11. Sindy Tsure says:

    Ok. Thanks for the info!

    I do understand about sims coming up on different hosts but, since H4, we don’t get sim restarts very often any more! 😛

    I still think it’s sorta-useful to know odd/even ahead of time but that’s mostly because I’ve been drooling over SVC-2485 ever since Simon said he had a fix about a month ago.. :\ No worries if you think it’ll do more harm than good to let us know before the roll starts – I can wait a little more. 🙂

  12. sloopy cooder says:

    I have high hopes for these next few releases… Mono will be a huge win. I never thought I’d say it, but I AM seeing better stability on the asset servers too! I’m amazed!

    I am seeing an increased level of region handoff problems, and teleports seem to be a lot slower the past week and are failing a little more frequently. Not sure if this is a networking issue or what…

  13. Xingyun says:

    greetings, sorry off topic post – only sl web page i can get to. is SL down can’t login to world, sl homepage, support, grid status. is anyone else having these probs (past 12 hours)?

    best of luck with ver 1.23

    thanks

  14. Paddy Wright says:

    I just wanted to say that the SL environment has been more stable than I have ever known. I am sure a lot has to do with the latest Release Candidate. It almost refuses to be broken. Somehow, someone, somewhere in LL is getting it right. My thanks.

  15. Argent Stonecutter says:

    SL has been stable, I’ve only had a few drops from region crossing in the past few weeks, and only a few cases where sculpt textures haven’t loaded until I’ve selected an object. But I’ve been seeing longer “grey times” where I’m waiting for objects, textures, and avatars to rez in after teleporting than I’ve seen for a long time.

    I can certainly live with that, if that’s the cost of a more stable SL!

    However, if you implemented SVC-2413 then the “grey time” would be less of a problem, because things closer to the avatar would rez first, so the process would be less noticeable.

  16. Prospero Linden says:

    Re: things being more stable, it’s not your imagination. We have two central systems which have been the source of most of our problems in the last month. There’s the central database cluster, and the asset server. (They are two different things.) We have modified some code to take load off of the central database server, *and* upgraded it to a beefier machine, so it’s in fairly good shape at the moment. (It’s not perfect, and problems elsewhere– as we’ve had once or twice with some code on other central servers– can still cause too much load to be directed to the central database machine. But it’s better than it had been a couple of months ago.) The asset server was in better shape after we swapped from the one in SFO as primary to the one in Dallas as primary. (This allowed us to get the techs from the solution provider in to do some work on the SFO cluster.)

    Hopefully we’ll be able to keep our heads above water.

    Re: things staying grey longer, I’ve noticed that too, and have heard it from some others. At this point, though, alas, it’s at the stage of several of us noticing that “it seems to be taking longer for things to rez”. Problems with region transitions could well be related to this. We haven’t yet nailed this down to even know where or what the issue is.

  17. Argent Stonecutter says:

    Dagnabbit, I can’t think of a way to test the change in rezzing without a time machine. 🙂

    Still, improving the initial interest list order would be a win… it’s not like initial rezzing was *ever* fast. 🙂

  18. Cappy Frantisek says:

    Good Luck Prospero.

    “We who are about to die salute you….”

    LOL

    *runs for the fallout shelter

  19. Hyddin McMillan says:

    LOL@18, We have been through worse and it has ALWAYS gotten better given time! Good Luck Prospero and CHEERS!

  20. Yep … stabler … a lot in spite of over 60K logons, especially in the weekend and while being in a fully packed sim. Only ‘crashed’ ( proces termination ) once or twice. Visited and listened to a live gig whithout hassle or crashing. Encouraging to see and actually experience stability improve .

    One oddity … At 11:00 am my time today ( GMT +1:00) i got the message ‘region full’ when requesting friends to be TPd to me. The sim i live upon had to be restarted to fix that. Dunno if that’s related to this roll-out.. and now my objects keep jumping up and forth whilst editing ..extremely annoying .. anywho.. 😐

    Oh.. and does the memory leak fix mean my FPS will not drop from 18 to 4 after a certain amount of time ? I sure hope so.

    Kudos.

  21. Moll Dean says:

    Hi Prospero.
    Hello everyone.

    I am having a good feeling about it… and we will have a better SL in version 1.24 with Mono.
    I am using the viewer 1.19 that is loading things faster than lastest Release Candidate and it is really more stable than before.

    The unique bug I have is when I TP home that it is over 500m and the floor become phantom. I didn’t find any ticket regarding this, but I am sure it will be fixed soon

    GOOD LUCKY

    hugs

  22. taff nouvelle says:

    @16

    There’s the central database cluster, and the asset server. (They are two different things.) We have modified some code to take load off of the central database server,

    Could this possibly be the reason for longer load times of textures causing more grey time? Just guessing here though.

  23. taff nouvelle says:

    @20.
    If you are still using the standard client 1.19, there is a bad memory leak in that; it has been cured in the RC version, which is amazingly stable now

  24. Dante Tucker says:

    @20 This is a memory leak for the server thats fixed. So it wont help your computer, it will help the computer the server is running on ^_^

  25. DISGUSTED IN LINDEN LABS™® *!@$!! says:

    something is not fixed again same as the other day it keeps making me lose my damn internet connection i do not know what you guys have done but its not good again

  26. taff nouvelle says:

    @24.

    There is nothing that LL can do to make you lose your internet connection, I suggest you contact your ISP, or check your modem connections 🙂

  27. Ska says:

    Sorry to be off topic, but is anyone having problems logging into sl itself and the website?

  28. taff nouvelle says:

    @26.
    Ska, it all seems ok to me, have you changed anything since your last loggin, firewall maybe?

  29. Sandling Honey says:

    Yes, I am having issues too Ska – Since today, without having changed anything.

  30. DISGUSTED IN LINDEN LABS™® *!@$!! says:

    @25 oh yes there is i have been on SL along time and many times i have had my ISP test it while SL is doing upgrades and it will lose connection now please don’t tell me it cannot happen… last week this same issue happened now they are started rolling restarts again and im having disconects again it is ONLY while on SL and it happens to my PC and my laptop same

  31. Ska says:

    @Taff: Hiya, and thanks for your reply. I didnt change anything, checked with the modem, turned off firewall just in case, reinstalled sl… finally to find out my alt logs in, my main just doesnt.

    @Sandling: Sorry to hear that Sandling, but I am glad I am not the only one. I just tried logging in with my alt though, and that worked just fine. It’s just my main that refuses to log in. Waaaah!

  32. Curtis Dresler says:

    Disgusted, perhaps SL is your only continuous connection? Most Internet connections are self-healing and you wouldn’t notice a disconnect even when browsing. I only notice it on occasion because the laptop has to toss up a large splash screen when it reconnects. The other machines aren’t so profusive.

    I’d love to see an explanation sometime of how the various servers work together. I assume each application server isn’t a single machine, but a cluster for each (if so, is load balancing one of the issues?). Also curious how SL sees the SL servers interconnecting with open sims in the future – how SL residents are going to move beyond SL borders. I know there are no details and perhaps no roadmap yet, but surely someone has some percolating ideas to share…

  33. Sandling Honey says:

    Ska: Apparantly a DNS issue or something. I’m with the Xcess4All provider in The Netherlands. I connected wireless and am able to connect with my laptop trough another ISP, so yeah.. perhaps some colocator issue again or a DNS failure (Sadly not very technical here, but I’ve seen this happen before)

  34. Ska says:

    Sandling, I am also with Xcess4all in The Netherlands… but whats funny is that my alt connects no problem while my main gets stuck in logging in forever, then moves up to connecting to region and gets stuck there too. I am less technical than you so I have no idea what you mean with colocator :S

  35. err.. ok

    @:22 Running Second Life 1.20.13 (91658) RC.

    @:23 .. touchez.. 🙂

  36. Kenny Devoix says:

    I was losing my connection when SL now and then also,finally traced it to my router. Which over a period of 3 or 4 years had become weak but only when both the wife and I were in SL using a lot of bandwidth would it act up. New router problem gone.

  37. Argent Stonecutter says:
  38. Sandling Honey says:

    Ska; Exact same issue! – I can log in with my alts too, though I have not tried to stay logged in with my alts for a longer time yet. What a weird issue.

  39. Kenny Devoix says:

    @37 Sandling, It might possibly be that one avie is on the SF server and the other is on the one in Texas I believe. Its letting you connect to one but not the other. Don’t have a clue how to fix it but has run into that in the past myself.

  40. Ska says:

    Sandling, very weird indeed… I’ll keep you posted if anything changes. Good luck!

    And sorry to the rest for being off topic, dont hit me *smiles sweetly*

  41. DISGUSTED IN LINDEN LABS™® *!@$!! says:

    @31 it is definately related to SL it only happens when i use SL and they have rolling restarts going on or back end issues every dang time never fails same thing last weeeek at he time they were doing restarts canceled them and back tracked had same diconnects and they resolved when sl resolved but when u tell them in live help it always blamed on my ISSP and my ISP has checked it many times with me and its SL causing it

  42. taff nouvelle says:

    @40,
    I am afraid SL cannot cause your ISP to disconnect you, BUT , it can show up a poor connection to you ISP.
    That is probably what is happening, you need a rock solid connection to run SL, while you can surf the web with a flaky one without problems.

  43. Norma Desmond Junior says:

    Sadly, I too have had my dsl internet connection broken MANY MANY times -when downloading inventory after cache clearing. Been happening for months. And at least 30 kernel panic full computer crashes in last week. I see big blocks on screen and the gray rezzing. My computer and dsl line check out just fine. Only when in SL does this happen.

  44. Yshmael Zapatero says:

    i pray that it works i have crashed so much all my bones are broken my poor avi should on life support.

  45. Zi Ree says:

    Linden Lab really needs to do this test, where they announce that they worked on something and restarted a service and then watch people go crazy like “now it fails, it worked before!” or “everytime you guys change anything my connection breaks!”. And then they reveal that they didn’t change anything at all … 🙂

    This is really one of the biggest psychological playground ever created 🙂

  46. Elliott Eldrich says:

    I’m hoping this new server release will address the #1 problem that I’m experiencing, which is periodic freezing that happens in any client newer than 1.19.0.5 (anything with Windlight). I’ve been trying the 1.20 RC series, and the problem never goes away. I’m running a generation 1 Mac Pro with 10 GB of ram(!) and a new Nvidia 8800GT video card. Have the very latest version of OS X (10.5.4) yet keep experiencing this lockup issue.

    Sometimes when I run the newest RC (1.20.13) it will run just fine for hours, with no lockups at all, or sometimes just a couple of lockups early in the session, and then it clears up. Other times I’ll be running it for a while, and then it starts locking up, and gets worse and worse, until I have no choice but to log out because it’s locking up continually (freezes for about ten seconds or so, then unfreezes for maybe a couple of seconds, then locks up again, then unfreezes for a couple of seconds, and just continues like that until I log out.) I’ve tried going away from the computer for a while and then coming back to see if it clears up, but once it gets into that mode it never recovers, and I’m forced to log out.

    It is my fond hope that this issue will be fixed soon. I’d really like to be able to run a Windlight based viewer.

  47. Lukas Mensing says:

    I noticed a new behavior since a week or about:
    Avatars do not rezz immediately, you first see like a kind of white haze then they start rezzing quite slowly…
    In fact, I wonder if its possible to keep this haze in some cases as an avie?
    It’s also more common when it’s bots (nothing to do with complaining about bots on sl, I do it in other posts :).
    Did anyone else notice this?

  48. Dekka Raymaker says:

    @ 46, this is the new Ruth.

  49. Lukas Mensing says:

    @Dekka
    Thanks Dekka!
    Well, I love it. Since I don’t use “much” of my avie, I tried to be “haze-ruthed” definitely but couldn’t…Is there a way?

  50. Xen Akula says:

    “That’s right, you know what time it is, keep on rollin’, baby”

  51. awnee dawner says:

    @45
    hey, ive had bad freezes in the past (viewer 1.19.1.4 imac intel c2d 2 Ghz 1GbRam/128MbVram ati x1600 osx 10.4.11)
    computer freezes completely no force quit possible – have had to power off, …
    however i have set my “viewerpreferences”
    Graphics/HardwareOptions/Texture Memory to -> 1/4 in my case 32Mb
    no freezes the viewer runs fast and smooth as butter now, i can stay on 24/7 exept when we have a rolling restart 🙂 or my isp looses connect.

  52. Tanya Spinotti says:

    Prospero and the team – Thanks for hammering away at the code to make it more stable. Also, thanks for having the guts to admit when things aren’t right and to revert them. Keep up the good work.

    And as there seems to be a “let’s blame LL for random things” theme here:

    I notice my car uses more fuel after I’ve been on SL the day before. What are LL doing about that? Does the new simulator code have a fuel leak? I shall be writing to my local politician forthwith 😉

  53. Soap Clawtooth says:

    Question: Since you’re so proud of all the big shiny masses of profit you’re making out of we, the residents. Why don’t you send a Happy Happy Linden out with a shopping list that has the words ‘MUCH BETTER SERVERS’ imprinted in bold text upon it? HMMM?

    😀

  54. Prospero Linden says:

    Eliott : just a reminder, this is a **server** release, and will not change anything in the client.

  55. taff nouvelle says:

    @51.
    Its probably from driving fast cars here, your foot tends to drift downwards on the gas pedal, hence uses more fuel, SLOW DOWN :-))

  56. Linda Brynner says:

    After a week or so I tried SL again… it stayes a mess;
    unplayable.

    Don’t know what is wrong…

    – SL’s website doesn’t trigger bandwith; loading takes minutes.
    All other websites load perfectly and fast, except SL’s.
    – Inworld crash every other 15 minutes.
    – Inworld the bandwith drops every other minute to 0, making my
    avie get stuck or drift.
    – Masses of deserted places inworld.

    C u all next week or so…
    But maybe I really quit now…

  57. Alyx Sands says:

    Prospero, what was the evil thing you did wrong with the restart? 😉 Just kidding! I appreciate your hard work. (Although I’m still trying to figure out how to measure time in minuets…a minuet is several minutes long… 😀 )
    *loves funny typos*

  58. Simon Kline says:

    @51,54 I think havoc 4 has improved my fuel effeciency, pre havoc 4 my car was driving INTO the ground, using much more fuel than needed, now it seems to sit on top and use far less! My RL car seems to be running much better now thanks to this, I think.

    @55 Linda I hope things get better soon for you, are you on a wireless network by any chance? IM me in world and i’ll help you trouble-shoot your problems. SL is worth persisting with 😀

  59. Balpien Hammerer says:

    Hey! I have been timing my region for weeks preparing for an event. The script load was consistently around 0.4ms. Now after this restart, the same script load shows as 0.8ms. What was done to slow down scripts by half?

  60. Elliott Eldrich says:

    Prospero: You said “Eliott : just a reminder, this is a **server** release, and will not change anything in the client.”

    I realize this, but I also think that there is at least a chance that the problems I’m experiencing may have something to do with the server side code. Maybe I’m totally wrong on this, but I can’t help but think that the server code is part of this problem. Regardless, it sure would be nice to see this bug fixed.

  61. Yukinoroh Kamachi says:

    Say, you forced my product out of sales four months ago when you implemented havok4; are you going to fix buoyancy soon ?

  62. Balpien Hammerer says:

    #60, Yuki, I think buoyancy is as it is now. Almsot everyone else redid their scripts. There’s a long thread on this problem in the jira report.

    BTW, on the performance issue, bug SVC-2649 filed, I now see time dilation of 0.96-1.0 when before this rolling restart it was a solid 1.0. Wings that used to sppear/disappear nearly instantly upon hover/descend now that 3-4 seconds to react.

  63. Joshe Darkstone says:

    rolling restart again? was wondering why revenues were a bit sluggish 😛

    erm, floating text bug back? alll on 1 line for me. resetting the scripts works, but alot of our floating text objects have the scripts deleted once we put them out :/

    @stability – assets do indeed seem to be alot more stable lately, good job that 🙂

  64. Meg says:

    @ Tanya.

    Blame Teagan.

    ____________

    No really, thanks LL, this is one step in the right direction! : D

  65. Seth Ock says:

    Linda (@55), I wonder if your ISP is interfering with SL traffic? Remember how a number of ISPs were throttling or even tampering with P2P connections a few months ago? What if your provider sees Second Life traffic as too demanding and is deliberately interfering with the connection? Don’t expect them to admit it though, and it’d take a detailed packet analysis to reveal.

    Typically, I’m able to stay online for hours at a time before memory leaks overwhelm my poor, solitary gigabyte of RAM. I haven’t had major stability issues for quite some time (running the release candidates), although I find the top quality graphics settings now cause me to crash, so I run with just the basics and have to give up SL photography for a while. And I do all this over a wireless connection using a three year old laptop.

  66. Kamachi says:

    @61 : The thing is that my product relies on small ApplyImpulse’s, if I implement a vertical movelock, these won’t work anymore.

    Also, for the test script that I posted yesterday in SVC-1792, the only thing that seems to counter the sinking is a vertical MoveToTarget, which really scatters the avatar x-y moving.

  67. Argent Stonecutter says:

    If only you could use the vehicle code in attachments. ๐Ÿ™‚

  68. Kamachi says:

    Believe me, I have tried countless times to fix my product since havok4 release, but each time the fixes had side effects that made me want to quit using it. I believe there are enough votes and watchers on SVC-1792 and SVC-2013 for Linden Labs to do something else than just assign a DEV number to it and make the bugs sit in the process for months!

    If I had access to the server source code, working on a patch would be the first thing I’d try to do, even before playing the game itself.

  69. Rick says:

    Will this sim restart help with the mainland search as most sims seem to be 10mins lag on search results and yet if you tp to another sim you get a realtime result from search basicly each sim is displaying diffrent results driving me mad lol… have a tinker with the land search server while your at it.

  70. glow Raymaker says:

    Has Prosperos Inferno torched the survey that didnt load when i tried to login as well??? And Login faild 3x in a row so ive given up trying. Looks like SL as usual! Good think I havnt logged in for the last week or so as nothing seems to have changed from the last yr that I was a premium account holder. So im just happy I reverted back to basic so at least im not paying for this grief now.

  71. Zorins says:

    I have to hand it to you guys, the difference is palpable, and we all know why (not that it is a slight against those involved). M has got the grid on-track, unquestionably. I mean, it was not long ago that the grid would just be completely down with no warning, and now we get a warning, a small quality run of the update, and a reasonable schedule that disperses the downed regions over a time-slot that reduces the inconvenience to residents. Good job.

    Now all we need to work on is the, uh, quality of the avatar mesh, and I’m a happy resident (Make Human would be good people to talk to there).

  72. GC Continental says:

    @24: are you on a wireless? If so, and using a linksys router (WRT54G in particular), turn off any cordless phones in the area. I had this same issue. The router and the phone were in each other’s radio space. I switched to a netgear and haven’t had a problem since.

    SL itself cannot bring down your connection unless the ISP is seeing the traffic and freaking out. (*extremely unlikely)

    I STILL haven’t actually installed the newest RC. Maybe I’ll do that right now and go looking for a 1.23 sim.

  73. Bau Ur says:

    I appreciate the integrity of the guy who will say whose fault something is, when the fault is his. I’d rather have a guy like that on the job than a guy that apparently never makes mistakes. Good form, Prospero. 20 minutes extra down time. No one dies. It’s cool.

  74. Ann Otoole says:

    yes avatars no longer rez properly. the attachments sit there in air till you zoom into each one. Just witnessed a parcel *auto return* for a bunch of stuff properly set to group. Apparently there are numerous things in the mix right now. Hold on for the ride lol!

  75. Xingyun says:

    greetings all,

    sorry off topic a bit, but i haven’t been able to logon to sl for a day now, also can not login to secondlife.com websites. this site works, blog.secondlife, grid websites work, linden lab.com works, just strange and haven’t changed any of my settings either.

    well best of luck with the rollout-hope to login one day again, lol

    using time-warner cable, anyone else having probs on time warner?

    thanks 🙂

  76. Xingyun says:

    justa quick, follow up tried pinging servers here are the results

    C:\>ping blog.secondlife.com

    Pinging lindenlab.wordpress.com [72.233.2.56] with 32 bytes of data:

    Reply from 72.233.2.56: bytes=32 time=53ms TTL=50
    Reply from 72.233.2.56: bytes=32 time=52ms TTL=50
    Reply from 72.233.2.56: bytes=32 time=51ms TTL=50
    Reply from 72.233.2.56: bytes=32 time=52ms TTL=50

    Ping statistics for 72.233.2.56:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
    Minimum = 51ms, Maximum = 53ms, Average = 52ms

    C:\>ping google.com

    Pinging google.com [64.233.187.99] with 32 bytes of data:

    Reply from 64.233.187.99: bytes=32 time=46ms TTL=240
    Reply from 64.233.187.99: bytes=32 time=44ms TTL=240
    Reply from 64.233.187.99: bytes=32 time=42ms TTL=240
    Reply from 64.233.187.99: bytes=32 time=51ms TTL=240

    Ping statistics for 64.233.187.99:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
    Minimum = 42ms, Maximum = 51ms, Average = 45ms

    C:\>ping secondlife.com

    Pinging secondlife.com [8.4.128.238] with 32 bytes of data:

    Request timed out.
    Request timed out.
    Request timed out.
    Request timed out.

    Ping statistics for 8.4.128.238:
    Packets: Sent = 4, Received = 0, Lost = 4 (100% loss)

  77. Noisey Lane says:

    “All” Search not working for me now. Other categories search is ok.
    Is it just me? – Second Life 1.20.13 (91658) Jul 8 2008 16:01:06 (Second Life Release Candidate)

  78. Urantia Jewell says:

    @75 Xingyun

    Try pinging (and do trace route) to “data.agni.lindenlab.com”.

    @ Prospero

    Good luck with the roll 🙂

  79. Moll Dean says:

    Thank you for the post update.

  80. Please update this blog, my region was updated without warning and after I did a log back I did a return all, have 15000 prims available, but 1000s of ghost prims that I am deleting manually. And this all for a deploy of 1.22.4 not 1.23.1!

  81. I wonder if for those who like it wouldn’t be a cool feature in one of the next server software to have a ‘random destination’ button in the map as it may save a lot of load on my external server and also on your servers as Hopping became kind of popular?

  82. Leeloo Nykvist says:

    “However, because of an error Prospero made when starting the roll, there are about 300 regions that will remain down for another 10-20 minutes. For this, he apologizes.”

    What did he do? Type “init 0” rather than “init 6”, so that someone needs to go down there and hit the power buttons to turn the servers back on?

  83. Soap Clawtooth says:

    “We tentatively plan to revert those regions back to 1.22”

    *Whacks you with rolled up newspaper*

  84. Shibari Twine says:

    /me nominates Prospero for the second annual Sidewinder Linden good communications award.

    Great job Prospero, and we know that rollbacks are painful, but we also know you guys care about us.

  85. Spank Lovell says:

    LOL @ Soap #81

    I’d have to agree with Spritely right at the top of this blog entry re: using sims that have signed up for testing.

    Using the preview grid and relying on users’ goodwill to test it patently isn’t working, Prospero.

    I have the hugest respect for you and your team out of all the teams at SL. It has always seemed to have tasks like this well planned and executed; however, lately there have been more rollbacks of server code because of bugs that were not spotted in testing. Although I have to say the rollback mechanisms you have developed seem to work extremely well.

    Spritely’s idea is excellent and would allow thorough testing to be carried out concurrently on the live grid instead of piecemeal as it is currently on the preview grid.

    But all that said you get a *BIG* thank you for all the efforts from your good self and your team. It’s much appreciated 🙂

  86. ** YOUR AD COULD HAVE BEEN HERE **

  87. AWM Mars says:

    What is more prolific here, is the level of information given about current and future updates/upgrades. For the first time in several years, we are not being treated like rats in the forever moving maze.

    It has and will breed more contentment as LL are seen transparently to be focusing on upgrading the platform to keep up with the influx of newcomers and layers of business, giving us stability and more reliability. This has been the main focus of all those that clamber to make posts on the blog, at last the cries have been heard.

    What I see, is LL coming back into the community, from appearing to shrink away for the past year or more. Excellent.

  88. Fredo Chaplin says:

    Come on guys !

    Same old song over and over…

    Roll out, roll back, roll out again, delayed ! Endless story.

    And all that without any respect for the hard work of all the event builders who suffer (or die…) from that.

    I’m sure you guys work hard too, and I totally appreciate what you do for us.

    But I cannot support HOW you do it this way. It’s really time to use professional testing and updating procedures, and stop this daddy-like way of doing things.

    The only quiet period we have had these past months has been the God-blessed two-week-long SL5B period !!! Surprisingly no roll out during that period.

    Hey guys, it’s time to grow up !!!

  89. Prospero Linden says:

    What did he do? Type “init 0” rather than “init 6”, so that someone needs to go down there and hit the power buttons to turn the servers back on?

    There is a system called the “region conductor” which is what is in charge of handing out regions to simulators to run. The region conductor has a throttle that limits the number of region starts per second, so as to avoid choking the central database by starting up too many regions at once. The default value of the number of starts per second is VERY conservative… we did this because we wanted to start at a safe value. It turns out that we can start regions a factor of 10 faster still without stressing the database. Most of the time, it doesn’t matter, because regions aren’t going down and restarting fast enough. However, during a rolling restart I open up the throttle on the region conductor so that regions will restart at the rate necessary to keep up with the rolling restart. I forgot to open up the throttle until the pilot roll was nearly done.

    Sorry about that :/
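
    As a rough illustration of the throttle described above, here is a minimal Python sketch of a starts-per-second limiter. It is not Linden Lab's code: the class name `StartThrottle`, the helper `drain_time`, and the rates of 1 and 10 starts per second are all assumptions made for the example. Only the idea of a per-second limit, the "factor of 10" headroom, and the backlog of roughly 300 regions come from the comment.

```python
class StartThrottle:
    """Hypothetical throttle: allows at most `starts_per_second` region starts."""

    def __init__(self, starts_per_second):
        if starts_per_second <= 0:
            raise ValueError("starts_per_second must be positive")
        self.interval = 1.0 / starts_per_second  # minimum gap between starts
        self.next_allowed = 0.0                  # earliest time the next start may run

    def reserve(self, now):
        """Return the time at which a start requested at `now` may proceed."""
        start_at = max(now, self.next_allowed)
        self.next_allowed = start_at + self.interval
        return start_at


def drain_time(num_regions, starts_per_second):
    """Seconds until the last of `num_regions` queued at t=0 is allowed to start."""
    throttle = StartThrottle(starts_per_second)
    last = 0.0
    for _ in range(num_regions):
        last = throttle.reserve(now=0.0)
    return last


if __name__ == "__main__":
    # Assumed rates: a conservative default versus a throttle opened up tenfold.
    for rate in (1.0, 10.0):
        print(f"{rate:>4} starts/s -> last region starts after {drain_time(300, rate):.1f} s")
```

    The sketch uses a simulated clock rather than real sleeps, so it only shows how a backlog drains under a fixed rate; a real conductor would also have to watch database load, which is not modelled here.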

  90. Prospero Linden says:

    Mercia — what you describe may not be related to the rolling restart. If you have issues like that in your region, please contact support. (If this is a private island you own, you have access to concierge support.)

  91. Cappy Frantisek says:

    whee! rollback, sounds like a WalMart thing. Oh it’s a rollback that fixes the changes that broke the fixes…..

    *me head spins*

    Any more room in that fallout shelter?

  92. Leeloo Nykvist says:

    “I forgot to open up the throttle until the pilot roll was nearly done.”

    Not a funny whoops, then… Yeah, things like that happen, we are all human. Lots of information in your post, even though I don’t think it’s actually useful for us (we can’t touch those things anyway), I’m sure the new knowledge will be appreciated by those of us curious about the inner workings of SL.

    Is it just me, or did LL really open up the information flow lately? Seems like they cut out the distortion filter also known as marketing drones.

  93. Since 24 is my lucky number, let’s just skip 1.23 and go directly 1.24 🙂

  94. sloopy cooder says:

    RE Networking issues (somewhat OT)

    The way MOST ISPs work is hot-potato. Your local ISP will probably try to hand off data to LL’s ISP (Level 3) ASAP at the nearest point.

    In kind, LL’s ISP will try to dump data back off to YOUR ISP at the nearest return path.

    This can lead to WILDLY different performance / reliability for upstream versus downstream data, as well as very different paths.

    One thing is certain in all this. LL’s Level3 service has had lower reliability than my home DSL service… And THAT, is pathetic. Unfortunately, I’m pretty sure that LL’s contract with Level3 doesn’t give LL much of an out. I know, I’ve been there, and have dealt with contracts with several data centers. Of course I got burned once about 10 years ago by a colo with crappy service, so now all my contracts are TIGHT and are not so one sided. Along with that, none of the data centers I use have as many issues as Level3 either. I’m sure LL has learned from this, and I sure as hell hope they will renegotiate when the time comes up, or move to a better (carrier neutral) facility.

  95. Azte says:

    All I can say is all hell has broken loose on our sim which was part of this restart and subsequent rollback. I have ducklings flying like supersonic jets thru the sky and off world, a donkey that basically fell apart and an incredible amount of lag making the sim impossible to be in. Fingers crossed that things will improve.

  96. Prospero Linden says:

    Azte : if you have problems and can detail a clean reproduction, please file an issue in our public issue tracker with as much information as possible so that a Linden developer or QA person can reproduce the issue. https://jira.secondlife.com

  97. torridluna says:

    lol, Zi! 🙂

    /me cannot wait for Mono on the maingrid, I really hope all goes well.
    Regarding the textures not loading: This was very noticeable today.

    Cheers,
    Torrid

  98. ChatNoir Moonsoo says:

    This is not directly related to a server-discussion, but since we’re getting a lot of feedback here:

    In the Advanced – Rendering submenu is an option “Run Multiple Threads”. What exactly is being threaded, if that option is active at all?

  99. uh-oh says:

    Does anyone know how to fix the huge triangles that show up on the screen that blocks your view?
    I gave a friend of mine a 1 yr old computer with 2 gigs of ram, an ATI 256 video card and has a AMD 3500+ processor.
    This computer was working fine and I only gave it away so we could keep in touch after he moves to another state using SL.
    Is this the issue with WindLight everyone is talking about? I had great performance in SL on this computer and only bought another one to increase my ram and found I had a better selection of video cards that have the PCI express slot.
    So this computer is not an old boat anchor.

  100. Darien Caldwell says:

    @98,
    The Run Multiple threads is active. It runs the texture decode in a separate thread if enabled and if your system can run multiple threads of course. Just how much of a benefit it is can be debated. It’s currently an experimental option and not guaranteed to work flawlessly.

  101. Aquarius Paravane says:

    Re Spritely’s idea about signing up sims to try new server releases – how about the Linden mainland sims?

  102. Joshe Darkstone says:

    @Disgusted I for one would like to say that this IS a more professional way to manage the rollout. They chose a larger initial group of servers in order to identify a wider variety of issues. Once these issues were identified and a rollback deemed necessary they did what they had to do. You can’t test a widely distributed platform completely without the feedback of the user base on the live grid.

    I would opt for a longer period of testing between the initial rollout and the gridwide rollouts, to provide enough time for more of these issues to be identified than the usual 24 hours.

    I would also opt for a less frequent rollout than every other week, to minimize the effect on the economy. Having once been told that there should be no effect on the economy I have carefully correlated my income to the rollout schedule, and except for a few other dates where large service interruptions have occurred unrelated to a rollout, my low performing dates correspond with the rollouts quite well. Add to that the fact that a rollout has never proceeded without a corresponding lull in sales performance and I think I am quite right about the effects on the economy. I have asked a number of friends about their experiences, notably while a rollout was in progress… “what kind of day are you having” and they have expressed the same experience.

    Of course, that’s anecdotal evidence, LL could/should have a closer look themselves, since they have the data. Not sure they want to know 🙂

  103. Tristin Mikazuki says:

    So are you going to do the rolling restart at 5am tomorrow (thursday the 16th)?

  104. Ron Crimson says:

    @102: I can’t help but wonder if the in-world economy sags during rolling restarts *not* because people are having technical issues/crashes/transaction failures/what-have-you *but* maybe because they simply figure “Oh well, it’s another rolling restart” and don’t even bother to log in, opting to “wait it out” instead? That’s only speculation of course (based on, as you call it, anecdotal evidence) but… well, Zi Ree summed it up perfectly when she said SL is the biggest psychological testing ground ever 😉

  105. Joshe Darkstone says:

    @104 I’ve considered that, and the possibility that some people are just fickle enough to postpone a shopping trip when the server they are on restarts – get up and walk away and not come back (you can do that???) but the effect is the same regardless of the reason. It seems reasonable that it’s a mix of these issues: 1000 servers restarting, requesting assets, announcing themselves to each other and the grid, is bound to stress resources, timing out transactions, etc. Yesterday was “a bit sluggish” but not horrible.

    Rolling out half the grid aggressively would magnify that impact on the infrastructure and I can measure the impact of that in double digit USD losses per day. I even have it graphed, with rolling restarts mapped into the data.

    The point is, whatever the cause, a rolling restart affects the in-world economy in a significant way (about 30-35% of sales across the 3-day period in my case).

    If that were taken seriously then a measured rollout schedule, once a month perhaps, would cut the negative impact in half, reduce the number of rollbacks by giving them more time to test those limited rollouts and get it right before a gridwide deployment.

  106. Laraya Mills says:

    Prospero…the way you tackle the issues and the way you interact in the blogs, I appreciate very much. Gives a certain feeling of some “transparency” to me.
    As far as I am concerned I believe you are an enrichment to SL (was that english?). I hope you guys can fix everything swiftly, so that no more roll-backs will be necessary. But even this is already some improvement to earlier times I remember, when some versions which were malfunctioning remained in the system, till they were overwritten by the next newer version.
    Good luck to you. Greetz – Lara

  107. Scarlett Glenelg says:

    Hurray !!! 1.23 is to bring us a fix for the IM problems !!! Finally !!! Please give us 1.23 !!! We’ve coped with failing IM’s way too long already !!

  108. Ron Crimson says:

    @104: Hmm, food for thought. Maybe the impact on the SL economy could be reduced by spreading out rolling restarts over 3 days instead of 2, and throttling the rate at which region restarts occur somewhat so the database load allows for more breathing room (I hereby refer to Prospero’s excellent explanation @ 89). I guess ultimately it’s a trade-off between shorter rolling restarts or a more healthy economy in – as it were – times of war. We need opinions on this matter. 🙂

  109. Cappy Frantisek says:

    Here is a crazy idea.

    Go back to the old days when a restart would be the whole grid at one time. Kick everyone off, load the new code, let logins back and have a full force watching the grid as it comes back to nominal load.

    I don’t mind being out for one full day but back and forth seems to break more things than it fixes.

  110. Cappy Frantisek says:

    104 Ron Crimson Says:
    “@102: I can’t help but wonder if the in-world economy sags during rolling restarts *not* because people are having technical issues/crashes/transaction failures/what-have-you *but* maybe because they simply figure “Oh well, it’s another rolling restart” and don’t even bother to log in, opting to “wait it out” instead? That’s only speculation of course (based on, as you call it, anecdotal evidence) but… well, Zi Ree summed it up perfectly when she said SL is the biggest psychological testing ground ever.”

    I don’t think it’s so much wait and see as it is people are afraid of not having transactions work at all. Too many times I have heard my friends say they were clearing bad transactions.

    So see my earlier post. If the whole grid came back at once, there would be about 30k pair of eyes watching the load factors (the other 30K would keep a log!). LL would find out REAL FAST if there was a transaction problem. Bots can’t send message when a transaction fails or a teleport fails but humans can and will. Burn up the telephone lines concierge members!

  111. Zi Ree says:

    I assume it would be best if Linden Lab didn’t notify the users at all that there is a rolling restart going on. Nobody would notice, other than a sim going down for 10 minutes and then coming back. I’m absolutely convinced that nobody would feel any adverse effects if the Lindens did a pilot restart of 100 regions without notifying anyone.

  112. Tristin Mikazuki says:

    The calendar says we are going to have a rolling restart this morning. Is that still going to happen?

  113. Prospero Linden says:

    Tristin — sorry about not updating the calendar. We are still trying to track down the bug that leads to screwed up text in some estate ban lists. If we find that bug early enough today, we’ll do the roll Friday and Saturday mornings. If not, it will be postponed to next week. (Note that the reason we will do rolls as late as Saturday has a lot to do with my regular work schedule. Although, like many of us, I’ll sometimes do a bit on my “days off”, my normal days are Wed. through Sat.)

    Cappy @109 : every time we have a rolling restart, this comes up. Somebody suggests it was better when all regions were offline for 4 hours (if all went well, and in practice often for quite a bit more than that), back when we had regular Wednesday-morning downtime, than for each region to be restarted once and (for the vast majority of regions) only be down for 10 minutes. The fact is that even though there still is some disruption with this way of doing things, the disruption is a whole lot less. Downtime would not be a magic bullet that would make deploys go more smoothly; it would just be more time when every sim is offline.

  114. Argent Stonecutter says:

    Cappy@109: In the old days they still had to do rollbacks on occasion, and the instability after an update often lasted through the following weekend… with multiple grid-wide restarts during that period. And that was with a much smaller grid and many fewer concurrent users. I don’t think we want to go back there.

  115. Tristin Mikazuki says:

    But the Wed. downtimes did have some nice pluses: knowing the sims would be down only 1 day a week made building/shopping way easier, and it got people in the beta grid lol

  116. Dex says:

    I have tried to register for SL and found this is the only way to communicate with SL. An email was returned and support does not answer my question.

    I have registered and am still waiting for the confirmation email. It has been over 12 hours. Is this because of the restart?

  117. Renee Faulds says:

    Sounds like this one is about totally borked !

    So roll it back out on a ‘weekend’ – just great…..

  118. Alisha says:

    Dex,
    Take a look in your email “spam” folder, Your confirmation email might be in there. Also try re-reggin with a different email if possible.

    Not sure, but I seem to remember issues with certain emails(Hotmail, msn).

    Best of luck!

    Alisha

  119. Dex says:

    Not in Spam, I managed to login to support with a guest account and reported the problem. Really bummed about it though. Doesn’t make me look forward to the experience of SL when i can’t even register!

  120. Joshe Darkstone says:

    yea, not interested in a return to the past on this, just in a more thoughtful schedule. 1000 sims, leave em for a week if need be to be sure it’s ok, fix what’s wrong when it’s not, then the grid rollout in 2 parts, then leave it be for a month at a time. that way the rollbacks will always just be for the 1000 that got it when something is wrong, rather than… oops, we figured out we need to roll it back the day after the rollout disrupted the entire grid for 2 days.

    @Dex – try registering another account, it sometimes just gets stuck in the system. in case the email didn’t come because of something on your side, create a google/yahoo/msn email account this time.

  121. Joshe Darkstone says:

    hrm, judging by the lull in sales today it *feels* like you are trying to slip that rollout by me anyway 😛

    something else going on that’s affecting the grid?

  122. WhatsTheDealy Oh says:

    Cappy @109 I agree with you totally.
    What Prospero and others don’t realize is businesses paid and worked hard for their traffic, only to have their region restarted 2 times in one day. Not everyone had to go through it, so it is an unfair disadvantage to those who did and a gain for those who didn’t.
    People bust their butts trying to get their traffic up with paid events and paid advertising only to have their traffic ruined by double restarts on their regions (money well wasted). (And users think the region is unstable and undependable and don’t come back.)
    This was not a grid-wide effect, which makes it a seriously unfair process. So back in the day when the grid was down for 4 hours, at least it was fair for everyone and everyone lost the same amount of traffic that they built. Now rolling restarts are a gamble as to which day your region gets borked. Everyone has to cancel events and sit around twiddling their thumbs wondering when this will happen and will it work?
    People trying to come up in the world only to get knocked down by failed roll outs. And not everyone was affected on the same day or same time. But hey who cares about that right? heh

  123. Argent Stonecutter says:

    Guess the rolling restart’s making it harder than usual to log in. I get a hang at “connecting to region” followed by an offer to teleport me home, and if I take that I get told I’ve been disconnected from the region I was in. Hope this is a temporary problem.

  124. Tristin Mikazuki says:

    Are you going to be doing the rolling restart today Saturday the 18th?
    It’s on the calendar.

  125. Dee Tunwarm says:

    Well I do hope this solves the server issues as SL servers have been inconsistent. But one wonders what the next problem will be …. assuming that the server issues are actually fully addressed.

  126. Pingback: Rolling Restart Mon-Wed 07/21-23 to deploy server version 1.23.2 « Official Second Life Blog

  127. Pingback: Rolling restart to deploy 1.23.3, Thu-Sat July 24-26 « Official Second Life Blog

  128. Pingback: Rolling Restart Mon-Wed 07/21-23 to deploy server version 1.23.2

  129. Pingback: [UPDATED] Rolling restart to deploy 1.23.3, Thu-Wed July 24-30

Comments are closed.