Today we're beginning to support the hCard microformat, designed for marking up contact data. This data is used to populate our directory of organizations and is subsequently displayed on Yandex.Maps and on the search results page.
OpenSSL 1.0.0 release, http://www.openssl.org/news/ 29 Mar 2010 (started 23 Dec 1998). 11+ years in the making
pwnat - NAT to NAT client-server communication - http://samy.pl/pwnat/
:-D RT @johlrogge: RT @technomancy Q: What's the difference between Ant and Maven? A: The creator of Ant has apologized.
Microsoft has unfortunately decided to close off development to native applications. Because of this, we won’t be able to provide Firefox for Windows Phone 7 at this time.
Shared by arty
a good introduction to the topic
Video on the Web—Dive Into HTML5. Everything a web developer needs to know about video containers, video codecs, audio containers, audio codecs, h.264, theora, vorbis, licensing, encoding, batch encoding and the html5 video element.
After the <device> element (see What's Next in HTML, episode 1), Ian sketched out an interface in the WHATWG HTML draft that builds on it, which looks quite interesting. Peer-to-peer connections (URL bound to change):
[NoInterfaceObject]
interface AbstractPeer {
  void sendText(in DOMString text);
  attribute Function ontext; // receiving

  void sendBitmap(in HTMLImageElement image);
  attribute Function onbitmap; // receiving

  void sendFile(in File file);
  attribute Function onfile; // receiving

  attribute Stream localStream; // video/audio to send
  readonly attribute Stream remoteStream; // video/audio from remote peer
  attribute Function onstreamchange; // when the remote peer changes whether the video is being sent or not

  attribute Function onconnect;
  attribute Function onerror;
  attribute Function ondisconnect;
};

[Constructor(in DOMString serverConfiguration)]
interface PeerToPeerServer : AbstractPeer {
  void getClientConfiguration(in PeerToPeerConfigurationCallback callback);
  void close(); // disconnects and stops listening
};

[Constructor]
interface PeerToPeerClient : AbstractPeer {
  void addConfiguration(in DOMString configuration);
  void close(); // disconnects
};

[Callback=FunctionOnly, NoInterfaceObject]
interface PeerToPeerConfigurationCallback {
  void handleEvent(in PeerToPeerServer server, in DOMString configuration);
};
You will still need some kind of intermediary (i.e. a server in almost all practical scenarios) to exchange the address, but after that things can get pretty interesting I think. I was hoping people would be willing to share their thoughts on the interface sketch above and the general idea of having access to peer-to-peer connections from Web pages and the Web platform in general.
Originally posted by Molly:
For some background, look up CSS Media Types: "all - Suitable for all devices." Opera on your desktop, your phone, in your truck, on your Wii and coming soon: television. We are the media="all" of the industry.
S/G | D/U | D/G |
---|---|---|
Sparse/Grounded | Dense/Ungrounded | Dense/Grounded |
Scala code that uses only constructs available in Java. The code is tied to relational algebra through the names of variables, functions, types, etc. | Compact Scala code that uses the higher-level constructs available in Scala. The code has no explicit ties to relational algebra. | Compact Scala code that is tied to relational algebra |
Fig. 3. For each programming style, the normalized time spent reading the algorithm's code. | Fig. 4. For each programming style, the mean normalized time spent examining a single lexical token in the program, regardless of its nature. |
Fig. 9. A screenshot of code as shown to subjects during the experiment, with a map of attention distribution. The algorithm on the left is S/G, read by subject 7; the algorithm on the right is D/U, read by subject 12. Note that the fixation on identifier names observed for S/G-style code is absent in the D/U-style code.
The experimental results show that Scala code written using advanced, abstract constructs is understood better than code written in a Java-like style. The difference in subjects' comprehension time is statistically significant despite the small sample size. As for the degree of comprehension achieved, informal remarks made by the subjects give subjective confirmation: using advanced constructs simplifies the task of understanding code. Interestingly, the benefits of using Scala were visible even in a group of programmers with a limited grasp of general Scala concepts.
[…]
An unexpected observation was that the time to understand the meaning of a token did not differ, despite the tokens' differing cognitive content. (Emphasis mine — lionet.) If this property generalizes, it would give language designers a precise target: the shorter the code, the better. It may also explain why domain-specific languages (DSLs) are so effective.
The biggest "brake" was judged to be fb.me, the Facebook social network's service: it needs a whole 2 seconds to redirect to a page.
Shared by arty
in short: you used to be able to learn JavaScript from examples on the real web, but now HTML is a bytecode container, like SWF, and may lose to them
If HTML is just another bytecode container and rendering runtime, we’ll have lost part of what made the web special, and I’m afraid HTML will lose to other formats by willingly giving up its differentiators and playing on their turf.
The document enumerates the set of specifications that constitute the Web Applications WG's Widgets Family of Specifications.
Shared by arty
oh, now Yandex too
Today we launched a new service for web developers: hosting of popular JavaScript libraries on Yandex servers.
By loading libraries from Yandex's CDN, you get the following advantages:
We will host fresh stable versions of libraries immediately after their release; old versions will be kept indefinitely.
Project news will be published in our club, where you can also ask questions and leave feedback.
the html5 spec has a convenient classList interface for working with classes (and other similar strings of space-separated words). Naturally, it builds on what JavaScript libraries long ago made convenient and familiar, so writing an adapter is an easy five-minute diversion:
Element.addMethods({
    getClassList: function(element) {
        element = $(element);
        // cache the adapter on the element; each method delegates to the
        // corresponding Prototype *ClassName method
        return element.classList || (element.classList = {
            has: attach('has'),
            add: attach('add'),
            remove: attach('remove'),
            toggle: attach('toggle')
        });

        // function declarations are hoisted, so attach is usable above
        function attach(name) {
            return element[name + 'ClassName'].bind(element);
        }
    }
});
and if you're really burning to get as close to the spec as possible, even in the part nobody ever uses, you can add a couple more methods:
item: function(index){ return element.classNames()[index]; },
length: function(){ return element.classNames().length; },
on the whole, of course, this is very similar to the adapter for the no less convenient dataset, which I made a couple of years ago.
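The space-separated-word logic the spec describes can also be sketched without any DOM at all. The following toy stand-in (TokenList is an illustrative name, not the real DOMTokenList) shows the semantics the adapter maps onto:

```javascript
// A minimal DOMTokenList-style wrapper over a space-separated string.
// Illustrative sketch only; the real classList lives on DOM elements.
function TokenList(str) {
    this.tokens = str.split(/\s+/).filter(Boolean);
}
TokenList.prototype.contains = function (t) {
    return this.tokens.indexOf(t) !== -1;
};
TokenList.prototype.add = function (t) {
    if (!this.contains(t)) this.tokens.push(t);
};
TokenList.prototype.remove = function (t) {
    this.tokens = this.tokens.filter(function (x) { return x !== t; });
};
TokenList.prototype.toggle = function (t) {
    this.contains(t) ? this.remove(t) : this.add(t);
};
TokenList.prototype.toString = function () {
    return this.tokens.join(' ');
};
```

For example, `new TokenList('a  b')` normalizes repeated whitespace, and `toggle` adds or removes depending on presence.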
Shared by arty
one thing you can't take away from Crockford is his talent for entertainingly criticizing existing technologies and looking at their history with detachment : ) Though, of course, not everything should be taken on blind faith. If you don't like video, here's the transcript: http://developer.yahoo.com/yui/theater/video.php?v=crockonjs-4
Last week, Yahoo! JavaScript architect Douglas Crockford delivered the fourth installment of his Crockford on JavaScript series:
In this session, Douglas tackles the DOM. On the one hand there was JavaScript, he says, and JavaScript is “what made the browser work.”
On the other hand, there was the Document Object Model, also known affectionately as the DOM. It is what most people hate when they say they hate JavaScript. Most of the people who say they hate JavaScript don’t know JavaScript, might have never seen JavaScript, but they’ve felt the DOM alright. If you don’t know what the difference is and you say, “JavaScript is the stupidest thing I’ve ever seen,” you’re not talking about JavaScript, you’re talking about the DOM. The DOM is the browser’s API. It is the interface. It provides JavaScript for manipulating documents.
The DOM may be imperfect, but it’s nonetheless crucial to what frontend engineers do when they write web applications. In this talk, Douglas provides an overview, situated historically, of where the DOM came from, how it achieved ascendance with Ajax, and what the future might hold. In Douglas’s inimitable fashion, this history starts with Sir John Harrington and takes us up to the present day. A few choice words for CSS are among the many applause lines for veteran developers:
I find within the community of people who use CSS great affection for it. They’re totally invested in CSS, they love it. They can’t imagine any other way of doing formatting in a document. It’s it. It’s sort of like watching an episode of Cops where the cops come in and break up the family dispute, and there’s this “CSS ain’t bad, you just don’t understand it like I do. I know it hurts me, but I make mistakes, I’m wrong.” CSS is awful, and it amazes me the way people get invested in it. It’s like once you figure it out, kind of go “oh, OK, I see how I might be able to make it work,” then you flip from hating it to loving it, and despising anybody who hasn’t gone through what you’ve gone through. It doesn’t make sense to me.
If the video embed below doesn’t show up correctly in your RSS reader of choice, be sure to click through to watch the high-resolution version of the video on YUI Theater.
flashblockdetector. Mark Pilgrim’s JavaScript library for detecting if the user has a Flash blocker enabled, such as FlashBlock for Firefox and Chrome or ClickToFlash for Safari. One good use of this would be to inform users that they need to opt-in to Flash for unobtrusive Flash enhancements (such as invisible audio players) to work on that page.
Facebook Adds Code for Clickjacking Prevention. Clever technique: Facebook pages check to see if they are being framed (using window.top) and, if they are, add a div covering the whole page which causes a top level reload should anything be clicked on. They also log framing attempts using an image bug.
Speed Tracer is a tool to help you identify and fix performance problems in your web applications. It visualizes metrics that are taken from low level instrumentation points inside of the browser and analyzes them as your application runs. Speed Tracer is available as a Chrome extension and works on all platforms where extensions are currently supported (Windows and Linux).
Using Speed Tracer you are able to get a better picture of where time is being spent in your application. This includes problems caused by:
Shared by arty
Code Bubbles: an interesting idea for a fundamentally new IDE design. Individual bubbles for methods/classes/data that can easily be grouped and separated, all living on one huge virtual working surface. The link has an 8-minute video that explains everything.
very cool!
Shared by arty
Directed Identity is, naturally, supported: https://i.mydocomo.com/
NTT docomo is now an OpenID Provider | OpenID - http://openid.net/2010...
«Why I switched to Pylons after using Django for six months» (reddit) - http://www.reddit.com/r...
Right now nobody’s interested in a mobile solution that does not contain the words “iPhone” and “app” and that is not submitted to a closed environment where it competes with approximately 2,437 similar mobile solutions.
Compared to the current crop of mobile clients and developers, lemmings marching off a cliff follow a solid, sensible strategy. Startling them out of this obsession requires nothing short of a new buzzword.
Therefore I’d like to re-brand standards-based mobile websites and applications, definitely including W3C Widgets, as “HTML5 apps.” People outside our little technical circle are already aware of the existence of HTML5, and I don’t think it needs much of an effort to elevate it to full buzzwordiness.
Technically, HTML5 apps would encompass all websites as well as all the myriads of (usually locally installed) web-standards-based application systems on mobile. The guiding principle would be to write and maintain one single core application that uses web standards, as well as a mechanism that deploys that core application across a wide range of platforms.
In the two days since the update was released to European Windows users, the number of Opera browser downloads has grown substantially.
Some top-end Philips TVs have a neat feature called Ambilight. Essentially, it's LED backlighting behind the TV that changes color depending on the colors in the picture. Watching a movie on such a TV is a real pleasure.
There are already Flash implementations of this kind of backlight, so why should we front-end developers be any worse? To find out once again what modern browsers are capable of, another experiment was born:
Ambilight for the <video> tag (Firefox 3.5, Opera 10.5, Safari 4, Google Chrome 4)
Below, let's look at how it was done.
Before writing anything, we need to work out the algorithm our backlight will follow.
The real backlight in the TV works roughly like this: on the back panel there's a row of bright LEDs glowing in different colors, with each LED's color roughly matching the color of the image region it sits behind. When the picture changes, the LED smoothly shifts to the new color.
From this description, we need to do the following: determine each LED's color for the current frame and render its glow. Let's get started.
For convenience, let's assume our "TV" has just 5 LEDs per side. Accordingly, we need to take a strip of the frame, divide it into regions (one per LED), and find the average color in each region; these will be the backlight colors:
To get an image of the current video frame, it's enough to draw it into a <canvas> via the drawImage() method:
var canvas = document.createElement('canvas'),
    video = document.getElementsByTagName('video')[0],
    ctx = canvas.getContext('2d');

// be sure to set the canvas size
canvas.width = video.width;
canvas.height = video.height;

// draw the frame
ctx.drawImage(video, 0, 0, video.width, video.height);
We've got the current frame; now we need to find out what color the pixels along the side of the image are. For that, we'll use the getImageData() method:
/** Width of the region we will analyze */
var block_width = 50;
var pixels = ctx.getImageData(0, 0, block_width, canvas.height);
The pixels object has a data property containing the colors of all the pixels. They're stored in a slightly unusual format: a flat array of the RGBA components of every pixel. For example, to get the color and opacity of the first pixel, take the first 4 elements of the data array; for the second pixel, the next 4; and so on:
var pixel1 = {
    r: pixels.data[0],
    g: pixels.data[1],
    b: pixels.data[2],
    a: pixels.data[3]
};
var pixel2 = {
    r: pixels.data[4],
    g: pixels.data[5],
    b: pixels.data[6],
    a: pixels.data[7]
};
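For illustration, this indexing rule can be wrapped in a tiny helper that works on any flat RGBA array (getPixel is my name, not part of the article's code):

```javascript
// Read pixel n from a flat RGBA array: each pixel occupies
// 4 consecutive slots (red, green, blue, alpha).
function getPixel(data, n) {
    var offset = n * 4;
    return {
        r: data[offset],
        g: data[offset + 1],
        b: data[offset + 2],
        a: data[offset + 3]
    };
}

// two pixels: opaque red, then half-transparent dark green
var p = getPixel([255, 0, 0, 255, 0, 128, 0, 64], 1);
// p is { r: 0, g: 128, b: 0, a: 64 }
```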
We need to split all these pixels into 5 groups (the number of LEDs we chose earlier) and analyze each group in turn:
function getMidColors() {
    var width = canvas.width,
        height = canvas.height,
        lamps = 5, // number of LEDs
        block_width = 50, // width of the analyzed region
        block_height = Math.ceil(height / lamps), // height of one analyzed block
        pxl = block_width * block_height * 4, // how many RGBA components one region holds
        result = [],
        img_data = ctx.getImageData(0, 0, block_width, height),
        total = img_data.data.length;

    for (var i = 0; i < lamps; i++) {
        result.push(
            calcMidColor(img_data.data, i * pxl, Math.min((i + 1) * pxl, total))
        );
    }

    return result;
}
In this function we simply walk over the blocks being analyzed and compute each one's average color with the calcMidColor() function. We don't need any clever formulas weighting the intensities of colors across the region; the arithmetic mean of each color component is enough:
function calcMidColor(data, from, to) {
    var result = [0, 0, 0];
    var total_pixels = (to - from) / 4;

    for (var i = from; i < to; i += 4) {
        result[0] += data[i];
        result[1] += data[i + 1];
        result[2] += data[i + 2];
    }

    result[0] = Math.round(result[0] / total_pixels);
    result[1] = Math.round(result[1] / total_pixels);
    result[2] = Math.round(result[2] / total_pixels);

    return result;
}
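As a sanity check, the same averaging logic can be run on a synthetic two-pixel buffer, with no canvas involved; pure red and pure blue average to purple:

```javascript
// Same per-component arithmetic mean as in the article, restated
// standalone so it can run on a plain array.
function calcMidColor(data, from, to) {
    var result = [0, 0, 0];
    var totalPixels = (to - from) / 4;
    for (var i = from; i < to; i += 4) {
        result[0] += data[i];
        result[1] += data[i + 1];
        result[2] += data[i + 2];
    }
    return result.map(function (c) { return Math.round(c / totalPixels); });
}

// two RGBA pixels: red and blue
var buffer = [255, 0, 0, 255, 0, 0, 255, 255];
calcMidColor(buffer, 0, buffer.length); // [128, 0, 128]
```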
So, we've obtained the LED colors, but they're too dim: real LEDs shine very brightly to reach a sufficient level of glow. We need to raise the brightness, and also the saturation to give the glow some depth. The HSV color model (hue, saturation, value) is very convenient for this: just multiply the last two components by some coefficient. But our colors are stored as RGB, so we first convert to HSV, boost brightness and saturation, and then convert back to RGB (the RGB→HSV and reverse formulas are easy to find online):
function adjustColor(color) {
    color = rgb2hsv(color);
    color[1] = Math.min(100, color[1] * 1.4); // saturation
    color[2] = Math.min(100, color[2] * 2.7); // value (brightness)
    return hsv2rgb(color);
}
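The article leaves the conversion functions out; one standard sketch (hue in degrees, saturation and value as percentages, matching the cap of 100 used above) could be:

```javascript
// RGB (0-255 per channel) -> HSV (h: 0-360, s and v: 0-100)
function rgb2hsv(rgb) {
    var r = rgb[0] / 255, g = rgb[1] / 255, b = rgb[2] / 255;
    var max = Math.max(r, g, b), min = Math.min(r, g, b), d = max - min;
    var h = 0;
    if (d !== 0) {
        if (max === r)      h = 60 * (((g - b) / d) % 6);
        else if (max === g) h = 60 * ((b - r) / d + 2);
        else                h = 60 * ((r - g) / d + 4);
        if (h < 0) h += 360;
    }
    var s = max === 0 ? 0 : (d / max) * 100;
    return [h, s, max * 100];
}

// HSV (h: 0-360, s and v: 0-100) -> RGB (0-255 per channel)
function hsv2rgb(hsv) {
    var h = hsv[0], s = hsv[1] / 100, v = hsv[2] / 100;
    var c = v * s;                                    // chroma
    var x = c * (1 - Math.abs((h / 60) % 2 - 1));     // second-largest component
    var m = v - c;
    var r, g, b;
    if (h < 60)       { r = c; g = x; b = 0; }
    else if (h < 120) { r = x; g = c; b = 0; }
    else if (h < 180) { r = 0; g = c; b = x; }
    else if (h < 240) { r = 0; g = x; b = c; }
    else if (h < 300) { r = x; g = 0; b = c; }
    else              { r = c; g = 0; b = x; }
    return [
        Math.round((r + m) * 255),
        Math.round((g + m) * 255),
        Math.round((b + m) * 255)
    ];
}
```

With these in place, adjustColor brightens a dark gray such as [50, 50, 50] into a noticeably lighter [135, 135, 135] (value multiplied by 2.7).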
LEDs are omnidirectional light sources. Radial gradients would depict them best, one gradient per LED. But to achieve a good visual result that way, you'd have to do a lot of tricky calculations: LED position, glow diameter and falloff, blending of neighboring colors, and so on. So we'll cheat a little: draw an ordinary linear gradient and lay a special mask over it that creates the impression of a believable glow.
Drawing the gradient is simple: first create it with createLinearGradient(), then add colors via addColorStop() and paint it:
// create a separate canvas for the glow
var light_canvas = document.createElement('canvas'),
    light_ctx = light_canvas.getContext('2d');

light_canvas.width = 200;
light_canvas.height = 200;

var midcolors = getMidColors(), // get the averaged colors
    grd = ctx.createLinearGradient(0, 0, 0, canvas.height); // the gradient

for (var i = 0, il = midcolors.length; i < il; i++) {
    grd.addColorStop(i / il, 'rgb(' + adjustColor(midcolors[i]).join(',') + ')');
}

// paint the gradient
light_ctx.fillStyle = grd;
light_ctx.fillRect(0, 0, light_canvas.width, light_canvas.height);
We get something like this:
We'll draw the mask in Photoshop. There's a wonderful Lighting Effects filter (Filter→Render→Lighting Effects…) that lets you create light sources. Fill a layer with white and run the filter with roughly these settings:
We get a light spot like this:
Change the blend mode to Lighten, duplicate, rotate, rescale, play with opacity, adjust levels, and we end up with this result:
Since the image is black and white, it's very easy to turn it into a mask where white becomes transparent. And if we lay this mask over the gradient, we get quite a pretty glow:
Best of all, we can easily change the look and intensity of the glow without resorting to programming.
The glow for the left side is done; what remains is to do the same for the right side, add smooth transitions between backlight states, and write a controller that refreshes the backlight at a set interval. Describing all that would be long and tedious; it's easier to look at the source.
UPD: as the experiment showed, HD video (the clip was originally 1280×544) doesn't work well for everyone; lowering the resolution to 592×256 solved the problem.
Some People Can’t Read URLs. Commentary on the recent “facebook login” incident from Jono at Mozilla Labs. I’d guess that most people can’t read URLs, and it worries me more than any other aspect of today’s web. If you want to stay safe from phishing and other forms of online fraud you need at least a basic understanding of a bewildering array of technologies—URLs, paths, domains, subdomains, ports, DNS, SSL as well as fundamental concepts like browsers, web sites and web servers. Misunderstand any of those concepts and you’ll be an easy target for even the most basic phishing attempts. It almost makes me uncomfortable encouraging regular people to use the web because I know they’ll be at massive risk to online fraud.
Internet Explorer: Global Variables, and Stack Overflows. An extremely subtle IE bug—if your recursive JavaScript function is attached directly to the window (global) object, IE won’t let you call it recursively more than 12 times.
Online photo editor Picnik has been acquired by Google, as the Picnik blog announces. The Picnik team is excited, writing that “It means we can think BIG. Google processes petabytes of data every day, and with their worldwide infrastructure and world-class team, it is truly the best home we could have found.” TechCrunch comments that “Interestingly, Picnik is Flickr’s default photo editor”... Flickr being a competitor to Google’s Picasa Web Albums.
A built-in image editor would make some sense in a whole lot of Google tools. Blogger, for instance, or Picasa Web Albums, or Google Presentations (beyond just vector-based editing), even Google image search (for, say, a quick contrast increasing of a pic you’ve found). A stand-alone photo editing app could be interesting too; for one thing, you can’t just install Photoshop on Google Chrome OS. Not sure if we’ll see the existing Picnik app itself surface in Google world, but it seems at least the skill set of the Picnik team could come in handy for Google if they plan any of these efforts.
[Thanks RiyAndroid!]
[By Philipp Lenssen | Origin: Google Acquires Photo Editor Picnik | Comments]
Over the last few years, I've occasionally commented on JavaScript's RegExp API, syntax, and behavior on the ES-Discuss mailing list. Recently, JavaScript inventor Brendan Eich suggested that, in order to get more discussion going, I write up a list of regex changes to consider for future ECMAScript standards (or as he humorously put it, have my "95 [regex] theses nailed to the ES3 cathedral door"). I figured I'd give it a shot, but I'm going to split my response into a few parts. In this post, I'll be discussing issues with the current RegExp API and behavior. I'll be leaving aside new features that I'd like to see added, and merely suggesting ways to make existing capabilities better. I'll discuss possible new features in a follow-up post.
For a language as widely used as JavaScript, any realistic change proposal must strongly consider backward compatibility. For this reason, some of the following proposals might not be particularly realistic, but nevertheless I think that a) it's worthwhile to consider what might change if backward compatibility wasn't a concern, and b) in the long run, all of these changes would improve the ease of use and predictability of how regular expressions work in JavaScript.
Actual proposal: Deprecate RegExp.prototype.lastIndex and add a "pos" argument to the RegExp.prototype.exec/test methods
JavaScript's lastIndex property serves too many purposes at once:

* Specifying where a search should start. This is arguably not lastIndex's intended purpose, but it's nevertheless an important use, since there's no alternative feature that allows this. lastIndex is not very good at this task, though. You need to compile your regex with the /g flag to get lastIndex to be used this way; and even then, it only specifies the starting position for the regexp.exec/test methods. It cannot be used to set the start position for the string.match/replace/search/split methods.
* Indicating where the last match ended. Here lastIndex serves as a convenient and commonly used complement to the index property on match arrays returned by exec. As always, using lastIndex like this works only for regexes compiled with /g.
* The fact that lastIndex is actually set to the end position of the last match rather than the position where the next search should start (unlike its equivalents in practically all other programming languages) causes a problem after zero-length matches, which are easily possible with regexes like /\w*/g or /^/mg. Hence, you're forced to manually increment lastIndex in such cases. I've posted about this issue in more detail before (see: An IE lastIndex Bug with Zero-Length Regex Matches), as has Jan Goyvaerts (Watch Out for Zero-Length Matches).

Unfortunately, lastIndex's versatility results in it not working ideally for any specific use. I think lastIndex is misplaced anyway; if you need to store a search's ending (or next-start) position, it should be a property of the target string and not the regular expression, and that arrangement would work better for several reasons.
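The zero-length-match quirk described above is easy to reproduce in any current engine:

```javascript
// lastIndex is set to the END of the last match; after a zero-length
// match it therefore doesn't advance, and naive exec loops get stuck.
var regex = /\w*/g;
var first = regex.exec('ab');  // matches 'ab'; regex.lastIndex becomes 2
var second = regex.exec('ab'); // matches '' at index 2; lastIndex stays 2
// without a manual regex.lastIndex++ here, the next exec call would
// keep returning the same empty match forever
```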
In fact, Perl uses this approach of storing next-search positions with strings to great effect, and adds various features around it.
So that's my case for lastIndex being misplaced, but I go one further in that I don't think lastIndex should be included in JavaScript at all. Perl's tactic works well for Perl (especially when considered as a complete package), but some other languages (including Python) let you provide a search-start position as an argument when calling regex methods, which I think is an approach that is more natural and easier for developers to understand and use. I'd therefore fix lastIndex by getting rid of it completely. Regex methods and regex-using string methods would use internal search position trackers that are not observable by the user, and the exec and test methods would get a second argument (called pos, for position) that specifies where to start their search. It might be convenient to also give the String methods search, match, replace, and split their own pos arguments, but that is not as important, and the functionality it would provide is not currently possible via lastIndex anyway.
Following are examples of how some common uses of lastIndex could be rewritten if these changes were made.

Start search from position 5, using lastIndex (the status quo):
var regexGlobal = /\w+/g,
    result;
regexGlobal.lastIndex = 5;
result = regexGlobal.test(str);
// must reset lastIndex or future tests will continue from the match-end position (defensive coding)
regexGlobal.lastIndex = 0;

var regexNonglobal = /\w+/;
regexNonglobal.lastIndex = 5;
// no go - lastIndex will be ignored. instead, you have to do this
result = regexNonglobal.test(str.slice(5));
Start search from position 5, using pos:
var regex = /\w+/, // flag /g doesn't matter
result = regex.test(str, 5);
Iteration, using lastIndex:
var regex = /\w*/g,
    matches = [],
    match;
// the /g flag is required for this regex. if your code was provided a non-
// global regex, you'd need to recompile it with /g, and if it already had /g,
// you'd need to reset its lastIndex to 0 before entering the loop
while (match = regex.exec(str)) {
    matches.push(match);
    // avoid an infinite loop on zero-length matches
    if (regex.lastIndex == match.index) {
        regex.lastIndex++;
    }
}
Iteration, using pos:
var regex = /\w*/, // flag /g doesn't matter
pos = 0,
matches = [],
match;
while (match = regex.exec(str, pos)) {
matches.push(match);
pos = match.index + (match[0].length || 1);
}
Of course, you could easily add your own sugar to further simplify match iteration, or JavaScript could add a method dedicated to this purpose similar to Ruby's scan (although JavaScript already sort of has this via the use of replacement functions with String.prototype.replace).
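Such sugar is easy to sketch today on top of exec and lastIndex (scan here is an illustrative name borrowed from Ruby, not an existing API):

```javascript
// Collect every match of a pattern in a string, Ruby-scan style.
function scan(str, regex) {
    // work on a /g copy so the caller's regex (and its lastIndex) is untouched
    var re = new RegExp(
        regex.source,
        'g' + (regex.ignoreCase ? 'i' : '') + (regex.multiline ? 'm' : '')
    );
    var matches = [], match;
    while (match = re.exec(str)) {
        matches.push(match[0]);
        // bump past zero-length matches to avoid an infinite loop
        if (re.lastIndex === match.index) {
            re.lastIndex++;
        }
    }
    return matches;
}

scan('a bb ccc', /\w+/); // ['a', 'bb', 'ccc']
```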
To reiterate, I'm describing what I would do if backward compatibility were irrelevant. I don't think it would be a good idea to add a pos argument to the exec and test methods unless the lastIndex property was deprecated or removed, due to the functionality overlap. If a pos argument existed, people would expect pos to be 0 when it's not specified. Having lastIndex around to sometimes screw up this expectation would be confusing and would probably lead to latent bugs. Hence, if lastIndex was deprecated in favor of pos, it should be a means toward the end of removing lastIndex altogether.
Actual proposal: Deprecate String.prototype.match and add a new matchAll method
String.prototype.match currently works very differently depending on whether the /g (global) flag has been set on the regex provided as the first argument:

* With /g: if no matches are found, null is returned; otherwise an array of simple matches is returned.
* Without /g: the match method operates as an alias of regexp.exec. If a match is not found, null is returned; otherwise you get an array containing the (single) match in key zero, with any backreferences stored in the array's subsequent keys. The array is also assigned special index and input properties.

The match method's non-global mode is confusing and unnecessary. The reason it's unnecessary is obvious: if you want the functionality of exec, just use it (no need for an alias). It's confusing because, as described above, the match method's two modes return very different results. The difference is not merely whether you get one match or all matches; you get a completely different kind of result. And since the result is an array in either case, you have to know the status of the regex's global property to know which type of array you're dealing with.
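The two modes side by side:

```javascript
var str = 'cat bat';

// with /g: an array of the matched strings only
var all = str.match(/[cb]at/g);   // ['cat', 'bat']

// without /g: an exec-style array -- full match, backreferences,
// plus the special index and input properties
var one = str.match(/([cb])at/);  // ['cat', 'c'], one.index === 0, one.input === 'cat bat'
```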
I'd change String.prototype.match by making it always return an array containing all matches in the target string. I'd also make it return an empty array, rather than null, when no matches are found (an idea that comes from Dean Edwards's base2 library). If you want the first match only, or you need backreferences and extra match details, that's what regexp.exec is for.
Unfortunately, if you want to consider this change as a realistic proposal, it would require some kind of language version or mode-based switching of the match method's behavior (unlikely to happen, I would think). So, instead of that, I'd recommend deprecating the match method altogether in favor of a new method (perhaps RegExp.prototype.matchAll) with the changes prescribed above.
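A rough approximation of the proposed semantics can be written in today's JavaScript (matchAll here is the hypothetical method from the proposal, sketched as a plain function):

```javascript
// All matches as an array of strings; an empty array -- never null --
// when nothing matches. Accepts a regex with or without /g.
function matchAll(str, regex) {
    var flags = 'g' +
        (regex.ignoreCase ? 'i' : '') +
        (regex.multiline ? 'm' : '');
    return str.match(new RegExp(regex.source, flags)) || [];
}

matchAll('1 and 22', /\d+/);       // ['1', '22']
matchAll('no digits here', /\d+/); // []
```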
Actual proposal: Deprecate /g and RegExp.prototype.global, and add a boolean replaceAll argument to String.prototype.replace
If the last two proposals were implemented, and therefore regexp.lastIndex and string.match were things of the past (or string.match no longer sometimes served as an alias of regexp.exec), the only method where /g would still have any impact is string.replace. Additionally, although /g follows prior art from Perl, etc., it doesn't really make sense to have something that is not an attribute of a regex stored as a regex flag. Really, /g is more of a statement about how you want methods to apply their own functionality, and it's not uncommon to want to use the same pattern with and without /g (currently you'd have to construct two different regexes to do so). If it were up to me, I'd get rid of the /g flag and its corresponding global property, and instead simply give the string.replace method an additional argument that specifies whether you want to replace the first match only (the default handling) or all matches. This would have the additional benefit of allowing replace-all functionality with nonregex searches.
Note that SpiderMonkey already has a proprietary third argument ("flags") for string.replace that this proposal would conflict with. I doubt this conflict would cause much heartburn, but in any case, a new replaceAll argument would provide the same functionality that SpiderMonkey's flags argument is most useful for (that is, allowing global replacements with nonregex searches).
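The proposed argument can be approximated today with a wrapper (replaceWith is an illustrative name; the replaceAll flag mirrors the proposal, including replace-all for nonregex searches):

```javascript
// replace() with an explicit replaceAll flag instead of the /g flag.
function replaceWith(str, search, replacement, replaceAll) {
    if (search instanceof RegExp) {
        // rebuild the regex with or without /g as requested
        var flags = (replaceAll ? 'g' : '') +
            (search.ignoreCase ? 'i' : '') +
            (search.multiline ? 'm' : '');
        return str.replace(new RegExp(search.source, flags), replacement);
    }
    if (!replaceAll) {
        return str.replace(search, replacement); // first occurrence only
    }
    // replace-all for plain-string searches, which /g can't express
    return str.split(search).join(replacement);
}

replaceWith('a.b.c', '.', '-', true);  // 'a-b-c'
replaceWith('a.b.c', '.', '-', false); // 'a-b.c'
```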
Actual proposal: Make backreferences to nonparticipating groups fail to match
I'll keep this brief since David "liorean" Andersson and I have previously argued for this on ES-Discuss and elsewhere. David posted about this in detail on his blog (see: ECMAScript 3 Regular Expressions: A specification that doesn't make sense), and I've previously touched on it here (ECMAScript 3 Regular Expressions are Defective by Design). On several occasions, Brendan Eich has also stated that he'd like to see this changed. The short explanation of this behavior is that, in JavaScript, backreferences to capturing groups that have not (yet) participated in a match always succeed (i.e., they match the empty string), whereas the opposite is true in all other regex flavors: they fail to match and therefore cause the regex engine to backtrack or fail. JavaScript's behavior means that /(a|(b))\2c/.test("ac") returns true. The negative implications of this behavior reach quite far when pushing the boundaries of JavaScript regular expressions.
I think everyone agrees that changing to the traditional backreferencing behavior would be an improvement—it provides far more intuitive handling, compatibility with other regex flavors, and great potential for creative use. The bigger question is whether it would be safe, in light of backward compatibility. I think it would be, since I imagine that more or less no one uses the nonintuitive JavaScript behavior intentionally. The JavaScript behavior amounts to automatically adding a ?
quantifier after backreferences to nonparticipating groups, which is what people already do explicitly if they actually want backreferences to nonzero-length subpatterns to be optional. Also note that Safari 3 and earlier did not follow the spec on this point and used the more intuitive behavior, although that has changed in more recent versions (notably, this change was due to a write-up on my blog rather than reports of real-world errors).
Finally, it's probably worth noting that .NET's ECMAScript regex mode (enabled via the RegexOptions.ECMAScript
flag) indeed switches .NET to ECMAScript's unconventional backreferencing behavior.
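To make the difference concrete, here is the spec-mandated JavaScript behavior next to the portable way of requesting that same behavior explicitly:

```javascript
// Group 2 never participates when the "a" alternative matches, yet the
// backreference \2 still succeeds by matching the empty string:
var surprising = /(a|(b))\2c/.test("ac"); // true in JavaScript

// In Perl, PCRE, Java, .NET, etc., that same pattern fails on "ac".
// Writing the backreference as explicitly optional gives JavaScript's
// behavior in every flavor:
var portable = /(a|(b))\2?c/.test("ac");  // true everywhere
```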
Actual proposal: Add a /u flag (and corresponding RegExp.prototype.unicode property) that changes the meaning of \d, \w, \b, and related tokens
Unicode-aware digit and word character matching is not an existing JavaScript capability (short of constructing character class monstrosities that are hundreds or thousands of characters long), and since JavaScript lacks lookbehind you can't reproduce a Unicode-aware word boundary. You could therefore say this proposal is outside the scope of this post, but I'm including it here because I consider this more of a fix than a new feature.
According to current JavaScript standards, \s, \S, ., ^, and $ use Unicode-based interpretations of whitespace and newline, whereas \d, \D, \w, \W, \b, and \B use ASCII-only interpretations of digit, word character, and word boundary (e.g., /na\b/.test("naïve") unfortunately returns true). See my post on JavaScript, Regex, and Unicode for further details. Adding Unicode support to these tokens would cause unexpected behavior for thousands of websites, but it could be implemented safely via a new /u
flag (inspired by Python's re.U
or re.UNICODE
flag) and a corresponding RegExp.prototype.unicode
property. Since it's actually fairly common to not want these tokens to be Unicode enabled in particular regex patterns, a new flag that activates Unicode support would offer the best of both worlds.
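A few quick probes illustrate the asymmetry (results per current, spec-conformant engines):

```javascript
// ASCII-only tokens:
var boundary = /na\b/.test("naïve"); // true: \b treats "ï" as a nonword character
var word     = /\w/.test("é");       // false
var digit    = /\d/.test("٣");       // false: Arabic-Indic digit three

// Unicode-aware tokens:
var space    = /\s/.test("\u00A0");  // true: no-break space counts as whitespace
```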
Actual proposal: Never reset backreference values during a match
Like the last backreferencing issue, this too was covered by David Andersson in his post ECMAScript 3 Regular Expressions: A specification that doesn't make sense. The issue here involves the value remembered by capturing groups nested within a quantified, outer group (e.g., /((a)|(b))*/
). According to traditional behavior, the value remembered by a capturing group within a quantified grouping is whatever the group matched the last time it participated in the match. So, the value of $1 after /(?:(a)|(b))*/ is used to match "ab" would be "a". However, according to ES3/ES5, the value of backreferences to nested groupings is reset/erased after the outer grouping is repeated. Hence, /(?:(a)|(b))*/ would still match "ab", but after the match is completed $1 would reference a nonparticipating capturing group, which in JavaScript would match an empty string within the regex itself, and be returned as undefined in, e.g., the array returned by regexp.exec.
My case for change is that current JavaScript behavior breaks from the norm in other regex flavors, does not lend itself to various types of creative patterns (see one example in my post on Capturing Multiple, Optional HTML Attribute Values), and in my opinion is far less intuitive than the more common, alternative regex behavior.
I believe this behavior is safe to change for two reasons. First, IE does not implement this rule and follows the more traditional behavior on this point. And second, this is generally an edge case issue to all but hardcore regex wizards, and I'd be surprised to find regexes that rely on this bit of behavior as currently mandated by JavaScript.
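For reference, this is what the spec-mandated reset looks like in practice:

```javascript
// Repeating the outer group erases group 1's capture from the first
// iteration, so only the last iteration's group survives:
var match = /(?:(a)|(b))*/.exec("ab");
// match[0] is "ab" and match[2] is "b", but match[1] is undefined per
// the spec (IE, following the traditional behavior, returned "a")
```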
Actual proposal: Add an /s flag (and corresponding RegExp.prototype.dotall property) that changes dot to match all characters including newlines
I'll sneak this one in as a change/fix rather than a new feature since it's not exactly difficult to use [\s\S]
in place of a dot when you want the behavior of /s
. I presume the /s
flag has been excluded thus far to save novices from themselves and limit the damage of runaway backtracking, but what ends up happening is that people write horrifically inefficient patterns like (.|\r|\n)*
instead.
Regex searches in JavaScript are seldom line-based, and it's therefore more common to want dot to include newlines than to match anything-but-newlines (although both modes are useful). It makes good sense to keep the default meaning of dot (no newlines) since it is shared by other regex flavors and required for backward compatibility, but adding support for the /s
flag is overdue. A boolean indicating whether this flag was set should show up on regexes as a property named either singleline
(the unfortunate name from Perl, .NET, etc.) or the more descriptive dotall
(used in Java, Python, PCRE, etc.).
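Until such a flag exists, the contrast looks like this:

```javascript
var s = "<b>line1\nline2</b>";

var dot       = /<b>.*?<\/b>/.test(s);         // false: dot won't cross the newline
var charClass = /<b>[\s\S]*?<\/b>/.test(s);    // true: the standard workaround
var slow      = /<b>(.|\r|\n)*?<\/b>/.test(s); // true, but invites runaway backtracking
```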
Following are a few changes that would suit my preferences, although I don't think most people would consider them significant issues:
- Allow an unescaped forward slash within character classes (e.g., /[/]/). This was already included in the abandoned ES4 change proposals.
- Allow an unescaped ] as the first character in character classes (e.g., []] or [^]]). This is allowed in probably every other regex flavor, but creates an empty class followed by a literal ] in JavaScript. I'd like to imagine that no one uses empty classes intentionally, since they don't work consistently cross-browser and there are widely-used/common-sense alternatives ((?!) instead of [], and [\s\S] instead of [^]). Unfortunately, adherence to this JavaScript quirk is tested in Acid3 (test 89), which is likely enough to kill requests for this backward-incompatible but reasonable change.
- Rename the $& token used in replacement strings to $0. It just makes sense. (Equivalents in other replacement text flavors for comparison: Perl: $&; Java: $0; .NET: $0, $&; PHP: $0, \0; Ruby: \0, \&; Python: \g<0>.)
- Remove [\b]. Within character classes, the metasequence \b matches a backspace character (equivalent to \x08). This is a worthless convenience since no one cares about matching backspace characters, and it's confusing given that \b matches a word boundary when used outside of character classes. Even though this would break from regex tradition (which I'd usually advocate following), I think that \b should have no special meaning inside character classes and simply match a literal b.
- ECMAScript 3 removed octal character references from regular expression syntax (although \0 was kept as a convenient exception that allows easily matching a NUL character). However, browsers have generally kept full octal support around for backward compatibility. Octals are very confusing in regular expressions since their syntax overlaps with backreferences and an extra leading zero is allowed outside of character classes. Consider the following regexes:
- /a\1/: \1 is an octal.
- /(a)\1/: \1 is a backreference.
- /(a)[\1]/: \1 is an octal.
- /(a)\1\2/: \1 is a backreference; \2 is an octal.
- /(a)\01\001[\01\001]/: All occurrences of \01 and \001 are octals. However, according to the ES3+ specs, the numbers after each \0 should be treated (barring nonstandard extensions, which are allowed) as literal characters, completely changing what this regex matches.
- /(a)\0001[\0001]/: The \0001 outside the character class is an octal; but inside, the octal ends at the third zero (i.e., the character class matches character index zero or "1"). This regex is therefore equivalent to /(a)\x01[\x00\x31]/; although, as mentioned just above, adherence to ES3 would change the meaning.
- /(a)\00001[\00001]/: Outside the character class, the octal ends at the fourth zero and is followed by a literal "1". Inside, the octal ends at the third zero and is followed by a literal "01". And once again, ES3's exclusion of octals and inclusion of \0 could change the meaning.
- /\1(a)/: Given that, in JavaScript, backreferences to capturing groups that have not (yet) participated match the empty string, does this regex match "a" (i.e., \1 is treated as a backreference since a corresponding capturing group appears in the regex) or does it match "\x01a" (i.e., the \1 is treated as an octal since it appears before its corresponding group)? Unsurprisingly, browsers disagree.
- /(\2(a)){2}/: Now things get really hairy. Does this regex match "aa", "aaa", "\x02aaa", "2aaa", "\x02a\x02a", or "2a2a"? All of these options seem plausible, and browsers disagree on the correct choice.

There are other issues to worry about, too, like whether octal escapes go up to \377 (\xFF, 8-bit) or \777 (\u01FF, 9-bit); but in any case, octals in regular expressions are a confusing cluster-cuss. Even though ECMAScript has already cleaned up this mess by removing support for octals, browsers have not followed suit. I wish they would, because unlike browser makers, I don't have to worry about this bit of legacy (I never use octals in regular expressions, and neither should you).
According to ES3 rules, regex literals did not create a new regex object if a literal with the same pattern/flag combination was already used in the same script or function (although this did not apply to regexes created by the RegExp
constructor). A common side effect of this was that regex literals using the /g
flag did not have their lastIndex
property reset in some cases where most developers would expect it. Several browsers didn't follow the spec on this nonintuitive behavior, but Firefox did, and as a result it became the second most duplicated JavaScript bug report for Mozilla. Fortunately, ES5 got rid of this rule, and now regex literals must be recompiled every time they're encountered (this change is coming in Firefox 3.7).
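The practical effect of the ES5 change can be seen in a function like this (a sketch; under the old ES3 rule, the literal could have been a single cached object shared across calls):

```javascript
function firstDigitPos(str) {
  var re = /\d/g; // ES5: a fresh regex on every call, so lastIndex starts at 0
  return re.test(str) ? re.lastIndex - 1 : -1;
}

firstDigitPos("abc1"); // 3
firstDigitPos("1abc"); // 0 under ES5; an ES3-caching engine could return -1,
                       // since the cached regex would resume at lastIndex 4
```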
———
So there you have it. I've outlined what I think the JavaScript RegExp API got wrong. Do you agree with all of these proposals, or would you if you didn't have to worry about backward compatibility? Are there better ways to fix the issues I've pointed out than what I've proposed? Got any other gripes with existing JavaScript regex features? I'm eager to hear feedback about this.
Since I've been focusing on the negative in this post, I'll note that I find working with regular expressions in JavaScript to be a generally pleasant experience. There's a hell of a lot that JavaScript got right.