Email to scripts broken again, fixed again

Around 2 PM PST I deployed a change to our inbound email processing system to reduce database load. It broke delivery of outside email to scripts. Email would be silently dropped without a bounce message. Email sent from one scripted object to another should not have been affected. I fixed the problem around 8 PM PST. I’m sorry if I broke your scripts.

The problem was an error in our “mailglue” script that links email from sources outside the Second Life Grid to in-world scripts. It is not related to the postfix configuration problem yesterday. Frankly, it got lost in the shuffle of our network outage earlier today.

Also, you may notice that the reply address for offline IM to email has changed. This is expected; details are in last month’s blog post, “IM to e-mail return addresses changing.” This change allows us to significantly reduce the database load related to IM processing.

Thanks for your patience as we work to improve Second Life’s performance. Thanks also to Temporal Mitra and the folks at rpgstats.com for their help in diagnosing this.

About James Linden

Developer, concentrating on client issues and user interface
This entry was posted in Announcements & News, Bugs & Fixes, Scripting, Service Status.

74 Responses to Email to scripts broken again, fixed again

  1. Valkyrie Eclipse says:

    Please be very careful of what you do nowadays – so much commerce-related activity is affected. Many have turned to outside solutions to get away from SL problems, so this sort of issue is a real concern.

  2. Taibah Takahe says:

    I can’t get logged on, this was the same problem as earlier, I thought the network issues were resolved.

  3. nimrod yaffle says:

    Everything is broken/breaking/getting broken.

  4. Karin says:

    I still cannot reply to an IM i just received in my email. I get : message undeliverable when replying.

  5. emelie Engawa says:

    I have been trying for two hours to log in….I get a message to try from a different location….tried that with the same results…..wasup??? Grid Status says it’s up!!

  6. RP85 says:

    I cannot loggon either…. ever since 10 PM central time… its 3 am :\

    Unable to connect to Second Life.
    Despite our best efforts, something unexpected has gone wrong.
    Please check http://www.secondlife.com/status and the Second Life Announcements forum to see if there is a known problem with the service.

  7. Usagi Musashi says:

    I’m still having problems. Are you sure they are fixed?

  8. Tegg B says:

    I am having no problems maybe it’s just some people have something different to everyone else, SL works fine from the opposite side of the planet.

  9. Smiley Barry says:

    Well… This is one prob as i’ve kept a special filter in Gmail for IMs from friends not being cached – while others were being cached and made the Second Life label light up.

    But, i’ll get over it lol.

  10. neb svarog says:

    Everyone makes mistakes, even me on occasions 🙂 That’s what QA is for. The beta grid is not a valid substitute for QA procedures.

  11. william Fish says:

    4 Karin
    i was getting that forawhile.. quick fix.. when you reply.. delete everything that’s in the reply including signatures… only send your short message. this should and has worked for me.

  12. Ravanne Sullivan says:

    Perhaps LL should have a staff meeting and discuss the meaning of the word RESOLVED. It does not appear that the definition is understood.

    When the same or similar issues continue to reappear you lose a lot of credibility. When you fix something be damn sure it’s fixed before declaring it RESOLVED and then DON’T BREAK IT AGAIN.

  13. Sylvia Sonoda says:

    I think it is very noble and impressive to write an entry in which a mistake is explained and “sorry” is said and the comment section is left open. My Respect to James Linden for that.

    And thankyou for having a new place to offtopic but “live” shout our comments about things going wrong on the grid at the moment.
    Which at this moment for me is: Absolutely nothing is going wrong at the moment.

  14. Sirlor Stonecutter says:

    Uhm… Obviously if you never had it fixed in the first place, It never was fixed in the first place. Alot of citizens depend on email for vendors and commerce. Why don’t you take that into consideration when messin with “mailglue”

  15. Dirk Meriman says:

    I don’t think it’s fixed completely… I’m camping (yeah, I know) and the camp board wasn’t talking to anyone in world. Several others, myself included, kept clicking on this thing thinking maybe it was lag or something. About 20min later, I happen to check my email and I have 13 msgs from said camp board, each one containing several lines of text from the board. I don’t get emails from items or ppl in game unless I’m logged off. When I noticed this and mentioned it to the others around and they all got the same emails (some of them had dozens)

  16. Cenji Neutra says:

    I know you guys are overloaded, but if you setup JIRA to email the dev team when new SVC issues are created, you would have caught it earlier.

  17. Jillian Callahan says:

    So how about the XML-RPC and HTTP delays and time-outs? Are y’all working on that? Please? Pretty please with sugar atop?

  18. James Linden says:

    Just to be completely clear, “I made a change” means a group of us designed it, I wrote it, I tested it, QA tested it, I merged it into the release codebase, I tested it again, QA tested it again, and we deployed it to Second Life. I don’t get to make random changes to SL whenever I feel like it. 🙂 I missed it because the thing that broke (mail to scripts) was not in the feature I changed (mail to IM). My code, my fault, mea culpa.

    Cenji – having Jira mail someone internally when issues are created is a good suggestion. We used to do this internally until the volume of Jira mail became overwhelming. However, the external Jira is smaller than our internal one – perhaps we could restart.

    Zero Linden may have more information about HTTP delays. We use the open-source libcurl internally for HTTP communication and I know we’ve discovered at least one bug in their code. We’ve patched it and submitted the patch to them, but I don’t know whether it’s been deployed to the grid yet.

  19. Ann Otoole says:

    looks like the colo network issues are back. same symptoms are cropping up all over. Guess they took the good hardware back off after the lindens stopped watching.

  20. Tegg B says:

    No problems James, don’t understand what was broke but glad to see it fixed, stuff happens, particularly like you say when it wasn’t directly related to what you were doing, so hopefully the wolves don’t chew you up too much, good to see they seem to have lost their teeth when someone says “Sorry” 🙂

  21. Dekka Raymaker says:

    Just a note, i hope it’s relevant to this topic, I get IM’s emailed to me although this option is unchecked in my preferences.

  22. Ann Otoole says:

    sl seems to be totally down now. absent. cant get past verifying protocal version on login screen. yet i see nothing on the blog about it. hmmm.

  23. janeforyou Barbara says:

    Gird offline—yepp. and its weekend

  24. janeforyou Barbara says:

    Total Residents: 4,025,450
    Logged In Last 60 Days: 1,361,704
    Online Now: 0
    US$ Spent Last 24h: $1,645,845
    LindeX Activity Last 24h: $242,756

  25. Ann Otoole says:

    yea feels like yet another major issue with the network equipment at the colo. Linden Labs needs to dump the current colo and sue them. Find a colo that has decent equipment. And a staff that monitors for degradation. The QoS at the existing colo is nonexistent.

  26. Tater Todd says:

    Switch to the mainline client if you’re trying to login with First Look

  27. Gala Alva says:

    well well well.. almost 34K, and no way of logging in… once and the boom headshot! off world!

    brava LL… you guys rock!

  28. janeforyou Barbara says:

    Topic or not topick–The main issue are math: in Mars 2008 there will be 25 mill nicks and there are and have been 0.85% of the nicks online at prime time if you go back to August 2006-0.85% off 25 mill = bout 215.000 active useres online even more i bet, in just a few weeks there will be 50-60k users online-Ruoters and servers and network must be able to handle that, not in 4 weeks time, but must be on place next week. Time goes fast in here 🙂

  29. Gdon Grizot says:

    Total Residents: 4,031,269
    Logged In Last 60 Days: 1,361,704
    Online Now: 34,759
    US$ Spent Last 24h: $1,635,494
    LindeX Activity Last 24h: $239,559
    WOW Online Now: 34,759 :-0

  30. Gala Alva says:

    fix the dam server and stop censoring our messages! wth! update the blog and let us SPEAK!

  31. william Fish says:

    ok this is really frustrating. im one of the many owners of an xploder. for those that dont know what it does.. people pay into it.. pot grows.. it xplodes and pays people sums of the pot.. i can get about 30-40 xplosions a day.

    Lately especially right now, im getting stale transactions… means the pot didnt pay the people who paid into it… means i have to go back into my chat history and find who didnt get paid.

    I can’t rely on the client’s account history tab… it’s delays about 6 hours!!!! i can’t rely on the transaction log page.. it’s delayed about 13 hours…. so when im not around to catch the chat from the xploder, i have no way to verify if someone got paid when they im me. THIS IS FRUSTRATING!

    LL i have sent many bugs today, so have my customers for stale transactions. YET there’s no blog report or BUG report about it.

    WHAT IS GOING TO BE DONE ABOUT OUR ACCOUNT LOGS? Just like every buisness in SL and RL we rely on our transactions to make SL move! I guess we all could just close up shop… but we cant and wont.

    Please hear our cries please with the love of Second Life and the TAO “chose our own work” chose this project and get it fixed. At this point i can live with an hour delay… please

  32. Sylvia Sonoda says:

    Can anyone tell me what this is? I see this little text on the main Blog page but not an entry in the forum. Is this new ways of saying things or am I just blind for not seeing this way of posting before?
    -----Left column main blog page-----
    » [Feb. 25 @ 12:30 PM PST] Logins may be slow, teleporting may occasionally not work, and some L$ transactions may be failing. We’re monitoring the situation.
    ----------------

  33. Gala Alva says:

    @william..

    wow that must be beyond frustrating! i sometimes go to a very well known casino for freeplay and reg play, and it eats our L’s up like a dry sponge. you loose about 10 – 15L in their server hole. its so frustrating not just for the players but the owner himself who just upped to a class 5 server and cant do nothing about LL’s broken pipelines.

    its downright frustrating.

  34. william Fish says:

    Feb. 25 @ 12:30 PM PST] Logins may be slow, teleporting may occasionally not work, and some L$ transactions may be failing. We’re monitoring the situation.

    ok i didnt see this… perhaps it’s cause im IN WORLD.. LL please post the blog first not the login. and monitoring is probably not the best choice of words….. how bout FIXing the problem… regardless if you are or not… it’s the best word to use! Monitor just means that you really dont believe our bug reports and want to see for yourself if it’s true or not. This has been ongoing since around 3am.

  35. Sylvia Sonoda says:

    -----Left column main blog page-----
    » [Feb. 25 @ 12:30 PM PST] Logins may be slow, teleporting may occasionally not work, and some L$ transactions may be failing. We’re monitoring the situation.
    ----------------

    Well almost 36.000 residents in-world at the moment. I think thats a record we do not wan to see. How about the restrictions Linden had announced?

  36. william Fish says:

    sylvia thanks for that i had no clue there was that many damn.. when does the restrictions go into effect? when we hit 40k? 50k?

  37. Ann Otoole says:

    Total Residents: 4,033,865
    Logged In Last 60 Days: 1,361,704
    Online Now: 36,046
    US$ Spent Last 24h: $1,739,863
    LindeX Activity Last 24h: $235,128

    sl has beat the 36K number

  38. Sylvia Sonoda says:

    I have no idea at what number Linden starts the restriction but above about 27.000 I stop rezzing, creating, chaniging clothes and teleports anyway because I am too scared i will loose things. Mainly above 27.000 I just sit at home and IM with friends like a normal chatbox and read and post out-world.

  39. Sylvia Sonoda says:

    I guess the bugs are now really starting because now the secondlife.com is hard to get. Mostly thats because lots of people start checking it after things going badly wrong. In a way thats rather late and maybe this would mean Linden was able to keep things going even beyond the 36.000 for a while. Stil not applying the restrictions earlier in the proces means Linden is still waiting for trouble before acting and thats my biggest critic towards them.

  40. Gdon Grizot says:

    Total Residents: 4,033,865
    Logged In Last 60 Days: 1,361,704
    Online Now: 36,077
    US$ Spent Last 24h: $1,739,893
    LindeX Activity Last 24h: $236,716
    [Feb. 25 @ 1:20 PM PST] We’re aware of inworld issues like slow logins, teleports failing, and L$ transactions not going through. We’re monitoring the situation.

  41. Talthybius Brevity says:

    Over 36,000 concurrency now, and 34,000 concurrency was cited as the straw that broke the camels back on created a restricted login contingency plan.

    If restricted logins are not invoked by the current situation, one begins to wonder if the contingency plan has actually been developed or if it is vaporware.

  42. Ann Otoole says:

    the contingency plan is for when there are issues they can troubleshoot.
    system functioning as designed right now.
    nothing to fix.
    eventually people will go away and it will return to normal.

    sysm is not designed for more than 20K users at once.
    is not scaleable.

    so the answer is a premium grid seperate from the main grid.
    all the rich folk can go be whitey shiny in the white grid and leave all us duirty folk in the po grid.

    odds are good the white gridd will be lame and full of people waiting for peiople to buy their expensive stuff.
    unfortunately most of the business in sl comes from the riffraff.

    grow up and whine about the contingency plan to the wall.

  43. Sylvia Sonoda says:

    It would be nice if for once the blog entries would not degenrate to the discussion of “which group which” again.

  44. Talthybius Brevity says:

    @ Ann Otoole “the contingency plan is for when there are issues they can troubleshoot.”

    Umm, no… quoting from the contingency plan announcement:

    http://blog.secondlife.com/2007/02/16/contingency-measures-to-ensure-service-as-second-life-grows/#more-777
    “When the Grid is under stress, resulting in content loss and a generally poor experience, we would like to have an option less disruptive than bringing the whole Grid down. So we’ve developed a contingency plan to manage log-ins to the Grid when, in our judgment, the risk of content loss begins to outweigh the value of higher concurrency. Looking at the concurrency levels, it’s clear heaviest use is on the weekends.”

  45. william Fish says:

    39 Sylvia Sonoda

    “Stil not applying the restrictions earlier in the proces means Linden is still waiting for trouble before acting and thats my biggest critic towards them.”

    I wonder, what kind of trouble? isnt the ability to hold a data base and relay it in any means what so ever trouble? TP to and from sims and crash.. isnt that trouble? hair and shoes up the arse.. isnt that trouble? not being able to pay anything due to stale transactions… isnt that trouble? please define trouble… cause i think we all saw pleanty of trouble today

    just trying to make a point it’s long past due.. not trying to degrade your comment 🙂

  46. william Fish says:

    [Feb. 25 @ 1:20 PM PST] We’re aware of inworld issues like slow logins, teleports failing, and L$ transactions not going through. We’re monitoring the situation.

    PERFECT AND YET NO BLOG REPORT>. so in order to figure out what’s going on i have to log off, log back in and read the log in screen.. makes sense.

  47. william Fish says:

    44 Talthybius Brevity

    yup that’s what i read too…. call me stupid and slap me silly.. but i think it’s under stress….

    and Ann there’s no need for that here.. im a paid customer that like many mall owners here … have free public yard sale lots for “poor riffraff and white gold lovers” alike to sell their items for free.

  48. Gdon Grizot says:

    [Feb. 25 @ 2:40 PM PST] We’re currently monitoring known inworld issues like teleports failing, L$ transactions not going through, and slow logins.
    Total Residents: 4,036,325
    Logged In Last 60 Days: 1,361,704
    Online Now: 34,384
    US$ Spent Last 24h: $1,744,916
    LindeX Activity Last 24h: $236,795

    NO william Fish Says: so in order to figure out what’s going on i have to log off, log back in and read the log in screen.. makes sense.

    On this web site there is information. It would be nice if they start a new Blog?

  49. william Fish says:

    » [Feb. 25 @ 3:00 PM PST] We’re currently monitoring known inworld issues like teleports failing, L$ transactions not going through, and slow logins.

    ok it’s not on the main page which btw took 2 mins to load.. happens… but it is on the top left of this and every blog post… so call me stupid cause i didnt look 🙂 im game and i do appoligize.

    could we have an inworld notice? ok ok ok im done 😛

  50. Dano Fugazi says:

    maybe work on getting a windows vista version of SL up and running? LL should of had a release by now. What the heck is going on???

  51. Grow says:

    Is this the william Fish blog of summin? hehe

    Anyway…. It’s Sunday guys, we all know what that means 😉 I havent bothered trying to log in for the past 5-6 weeks, (at this time on a Sunday) as theres really no point in trying. If I wanted a lagged, resouce heavy IM program, I’d install AOL’s junk 😛

    I’ll wait for atleast another 3hours before trying to log in.. Or atleast till the amount of online users (more than likely half of them are ghosts anyway) drop to around 25k. Not that the grid can handle even 25k effectivly but least IM and chat wont be lagged, and possibly I’ll be able to open my invo and change clothes then.. 😀

  52. Tegg B says:

    “Dano Fugazi Says:
    maybe work on getting a windows vista version of SL up and running? LL should of had a release by now. What the heck is going on???”

    Hmm that won’t help, most of us don’t have Vista, And seeing XP works fine , I’ll spend my $ elsewhere.
    Blame your new pal Bill, he’s the one who decided to create a non retro compatible operating system 😛

  53. Quantum Daikon says:

    Re. Vista.

    To some degree it really depends on your configuration. The current SL viewer runs fine without any issues on my Vista 64bit – core 2 system with a 7600GT card. So the only hold up should be the SL vid card support for Vista. Vista itself should run SL, it just depends on the particular hardware set-up. (Just for the record – Vista is pretty but not worth upgrading from XP)

    Quant

  54. Melissavp Islander says:

    From todays comunity page:
    Message of the Day:
    In our continuing effort to improve the Second Life experience, we will be updating to version 1.13.2 on Jan. 17th from 7am – 12pm PST (15:00-20:00 GMT). During this time Second Life will be unavailable. Please visit our Official Blog for more details.

    January????

    come on guys, we are nearly in March…..

    I know S… happens…. what about some QA control… (or is it my browser…)

    Oh, and yes leave an open blog allways so we can feed-back to you without been off topic…
    and yes dont bother to let it post….
    message is only for you….

    Mel

  55. Imogen Saltair says:

    Mel @ 54

    “Oh, and yes leave an open blog allways so we can feed-back to you without been off topic…”

    I have been looking for a more recent thread to post to than this one, and have come here to suggest what Mel suggests…. if the comments are going to be closed on any topics from LL for two days, what else can one do but be off topic?

    One place on the blog where customers may report problems.. please! At least we see that on average once in a comments section there is one yellow post that means somebody reads it. if you feel it necessary, make it strictly limited to a list of bugs/faults/difficulties with no positive or negative comments, refuse to post any containing flames/rants/whining if you like… and ANSWER THEM… something as simple as

    issue 1 – fixing
    issue 2 – next update will fix
    issue 3 – not an LL problem
    etc.

    make it daily issues (current only) and clear it daily, and request that if we see our problem already reported by another user we look for the reply to that user and not repost it.

    this way we would at least feel that we have some allowed place to feedback and some response to our attempts to report genuine issues which you may not be aware of or think are fixed.

    Thank you… Second Life rocks.. help us to help you to make it better. A blank wall is like a slap in the face.

    Imogen

  56. Lulu Flasheart says:

    Off topic – like the website transaction page is lagging again.

    “Another” database issue that they are not fixing

  57. Bartiloux Desmoulins says:

    For nearly a week now I am still unable to reply to offline IM’s I receive in my email mail box. Even if I reply to an offline IM within moments of having received it, I receive a reply saying…

    Command died with status 100: “/opt/linden/indra/tools/mailglue/mailglue --grid=agni --system=im”. Command output: This email to instant message session has expired.

    I rely on the ability to respond to offline IM’s to help me manage my business when I am unable to be in-game and this issue is a major inconvenience to say the least.

    Are there others out there having this problem?

    Warm regards,
    Bartiloux Desmoulins

  58. william Fish says:

    57 Bartiloux Desmoulins

    not sure if its worth looking into or not but when you reply to email from an im in world, on the reply email take out everything… then type in your comment.

    i use to get those emails bouncing back to me cause my signature which has weblinks(dont know if that’s whats causing it) so i just deleted everything then wrote… never had a prob since.

    maybe it works maybe it doesnt worth a try.

    STALE TRANSACTIONS upon money payout to those that won money from me once again… ugh… i dont have a way to varifiy this since this time i was offline when it happened. Sure i could wait 6-10 hours until the transaction webpage shows the data… or the account history log in world which was delayed up to 6 hours yesterday as well…. or i could swallow my loses and pay the customer… after all customer is always right right?

    Im still learning alot how this system works. Is the internal ticket thing apart of this? I’d post this on that new blog but comments are off.

  59. janeforyou Barbara says:

    » [Feb. 26 @ 1:16 PM PST] We’re aware of current problems inworld with group IM sessions failing and logins being slow/not working. We’ll update you as we have more info.

  60. william Fish says:

    THIS IS turning into a no comment blog page.. :((

    i just wanted to express my thanks to Torley Tester (whoever that maybe) for letting us know IN WORLD. I am having a hard time explaining to my customers why the stale transactions. This is much needed (the inworld notice) so thank you very much!

  61. lolo says:

    I swear email script causes more headaches than any other.

  62. Ann Otoole says:

    sl is down again. only 21K online. no connect to login server. hopefully someone is watching this blog since it is the only way to let you know.

  63. william Fish says:

    ann…

    me too. i can’t not even send a crash report.. just stays stale with “transmitting data”

    now why did i log off again? oh yeah cause SL was being laggy. lol

  64. janie matahari says:

    I cant login!! My roasting pig disappeared and i have a luau to host!

    :(( (pshycological damage)

  65. william Fish says:

    btw in that same ten mins i watch as there were 22.340 online to now around 18400… is it logging people off too?

  66. Ann Otoole says:

    hundreds of users dropping per minute now. this is probably the worst failure i have seen. sl is seriously going down. doesnt look good since it is after hours. maybe tomorrow they will wake up and realize the server farm died completely and finally invest in a failover system.

  67. william Fish says:

    was there maintenance scheduled for 7pm to 9pm today? 😛 jk i know when it was scheduled.

  68. william Fish says:

    i wonder if there’s any lindens in world right now seeing all of this… wonder how they gonna find out since we can’t even send crash reports.

  69. Marhjan Yamdev says:

    please fix the grid – this is the only place we can comment right now – only about 17,000 users on – that’s obviously not good. Wake up Lindens!!

  70. william Fish says:

    ok im in now… just a hiccup that kicked nearly 10,000 offline. there’s 17800 on now. took 5 mins to log in when it normaly takes 1-2 mins.

  71. G W says:

    too many peeps over reacting – this is not even close to the worst failure i have seen – just a blip – there are linden employess 24/7 – in fact i just logged in

  72. Ann Otoole says:

    oh dang that one felt so much like a “whoops kicked the plug out of the wall” f’up rofl

  73. Ann Otoole says:

    looking forward to some improvements wednesday. what i am reading sounds pretty darn good.

  74. it looks like a nice site, thanks..

Comments are closed.