Skip to content

Latest commit

 

History

History
417 lines (211 loc) · 20.4 KB

notes-2019-04-19.md

File metadata and controls

417 lines (211 loc) · 20.4 KB

Attendees:

  • Shane Carr - Google i18n (SC)
  • Myles C. Maxfield - Webkit/Apple (MM)
  • Zibi Braniecki - Mozilla (ZB)
  • Richard Gibson - Oracle/OpenJS Foundation (RG)
  • Frank Tang - V8 (FT)
  • Leo Balter - Bocoup (LB)
  • Val Young - Bocoup
  • Dan Ehrenberg (DE)
  • Felipe Balbontin - Google i18n (FB)
  • Benjamin Coe (BC)
  • Steven Loomis (SL)

Pull Requests

Normative: Add calendar and numberingSystem options (#175)

PR

DE: I think it was stalled at the implementation level. Is anyone interested in taking this on? For context, this is part of a broader effort to create parity between resolved options and options in the locale. I would love someone else to take this up.

ZB: A good first bug. Good mentorship available.

DE: Needs implementation in browsers. I started a patch for V8. Good first issue for someone who wants to work on browsers

FT: DE, can you clarify; you need browser prototyping?

DE: Writing the browser implementation and tests is the remaining work item.

FT: I’ll look through the PR for V8. https://bugs.chromium.org/p/v8/issues/detail?id=7729

DE: My recollection is that we’ve had consensus in this call, and also in TC39, and there weren’t any major concerns. I would like to see this in the browsers behind a flag before landing the PR.

FT: Do you think it’s still OK to be a PR?

DE: I think it’s fine to be a PR.

Normative: Improve handling of non-Gregorian calendars

PR

SC: It looks like this is in a similar situation as the previous PR?

FT: What is the difference between this PR and the previous one?

ZB: The core of this PR is that it adds a “related year”, which is helpful for non-Gregorian calendars specifically. So it’s not about unifying extension keys and options. It’s about handling non-Gregorian calendars better.

DE: There are 2 parts to this change. The first is the “Related Year” addition, which is useful in Chinese calendars. The second is in the data model of Intl, our mini CLDR where we have data in the specification. There was a weird error where the data should be keyed by both calendar and locale, but it wasn’t being keyed by both, which results in an unexpected code path. That data model doesn’t make any sense at all. So it means that there’s no great way to format calendars with other fields in them. Now that we have in Test262 you can put things together that are locale-dependent, we can test the related year.

FT: I guess I need to study the related year field. So it’s an option that could be specified, right?

DE: Right, there’s the related year field that can be specified, and then there’s the change in the data model to make locale data model selected by both locale and calendar.

FT: I’ll take a lead in V8 to follow up there. https://bugs.chromium.org/p/v8/issues/detail?id=9155

DE: Do any other implementers have concerns about this or the previous PR?

MM: I don’t think we have any concerns.

FT: What is the collator thing about here?

DE: Maybe that belongs in a separate PR.

Normative: Support BigInt in NumberFormat and toLocaleString

PR

SC: Thanks to DE for getting the decision for this PR and FT for implementing this in V8!

FT: I already land it into v8 last night and will ship with chrome m76.

DE: I plan to push BigInt to Stage 4 in the June meeting. I think Mozilla is about ready. BigInt is behind a flag in WebKit in progress. Kyle is working on it.

LB: I’m happy to land this PR once BigInt is Stage 4.

LB: I think BigInt causes changes to ECMA 262 as well. Are there PRs for BigInt into ECMA 402? DE, thoughts?

SC: Can you repeat your question?

LB: Are there any other observations in ECMA 402 that need to be changed after ECMA 262?

DE: Kyle is working on the PR for 262. I don’t think there are any extra ECMA 402 changes that are necessary. Because APIs will throw if they’re not ready to accept BigInt.

Normative: Permit use of implementation-defined tailoring for case mapping

PR

VY: This is a later iteration on something DE started.

SC: Is there anything else blocking this PR? Do we need implementations?

VY: This allows the use of an additional tailoring for case mapping. Does it need implementation when we’re just allowing browsers to use more data than Unicode?

DE: V8 has already been doing this for years. I think VY has been doing a great job. I think we can iterate and land the PR.

LB: It’s hard to observe this in tests. That’s my only concern. So as long as implementers say they’re OK, we can land as-is.

MM: No feedback from me.

LB: Can we confirm that there are no blockers for landing this after the editorial changes are done?

FT: I have no concerns.

LB: Thanks! I’m considering the overall silence to be positive consensus. We only have editorial cleanups to finish the work in that PR.

Active Proposals Updates

Intl.ListFormat

Proposal

FT: All three styles are applicable to all three types.

DE: Yeah, I reviewed and accepted that change.

SC: What are the blockers for bringing this to Stage 4?

FT: I think we just need another implementation.

LB: Would a polyfill count?

DE: Yeah, there’s a high-quality polyfill that someone put together. I would like to invite that person to come to one of our meetings.

LB: Considering the polyfill, I think we could bring this for Stage 4.

DE: I would rather wait for 2 browser implementations. Polyfills are really good, but I don’t think we should promote to Stage 4 because of 1 browser and a polyfill.

FT: I agree.

LB: Do we have status from Mozilla?

DE: Anba has a patch which is waiting for review. So I expect there will be something landed within 6 months.

FT: Is there a way we can engage more with the Safari folks?

MM: It’s pretty low-priority for us.

DE: Would you be open to external contributions to JSC?

MM: That’s a fantastic idea.

DE: So from that matter it’s likely someone putting the time into it.

Intl.Locale

Proposal

SC: Is the status similar for Intl.Locale?

DE: Yeah, Anba also has an implementation that is on the review queue for mozilla.

Intl.RelativeTimeFormat

Proposal

SC: Status here?

FT: I opened an issue to ask whether the format method should accept a BigInt. DE said this probably won’t come up.

LB: Test262 says that this is implemented except for formatToParts.

DE: Once they get upgraded to ICU 64, they plan to implement formatToParts. I don’t want to promote this to Stage 4 until we have two full implementations.

FT: It took me 2.5 weeks to migrate to ICU 64. So give them some time.

Intl.NumberFormat Unified API Proposal

Proposal

SC: Main issue about units. Blocking issues: Frank has an open PR to migrate to localize number format that this proposal is built around. Not sure the status in Mozilla.

DE: No mozilla implementation that I have heard from, maybe Zibi knows more.

FT: Size issue.

DateTimeFormat dateStyle & timeStyle

Proposal

FT: We flipped the bit to ship this in m76 in chrome!

DE: We got to Stage 3. I’m happy to see this shipping in Chrome.

SC: Do we know the Mozilla status?

DE: I don’t know.

FT: I thought they had something very close.

DE: They had an internal API, but I don’t know if they’ve done the plumbing for the public API.

Intl.DateFormat.prototype.formatRange

Proposal

FB: It got accepted to Stage 3. FT has does a lot of work implementing this in V8. We still need Test262. FT already sent a PR for initial tests for this. We also need implementations in other browsers. I would like what the plans are for implementing this in the near future?

FT: We need to increase test coverage. We have just really basic stuff there right now. I think (1) browsers and (2) better Test262 test coverage are required.

DE: Do you need help from Igalia on the test coverage?

FT: That would be nice.

DE: Are there other things we should work on test coverage for?

FT: I’ll forward that to you.

DE: Also please copy Ms2ger.

FB: We should also work on a Polyfill. Is that a hard requirement for Stage 4?

DE: No, it’s not a requirement. We’ve talked about working on polyfills, but it sounds like people don’t have the time for that. We are missing APIs for older parts of the Intl APIs as well.

Intl.Segmenter

Proposal

Break iteration and Richard's concerns

DE: This is implemented behind a flag in V8, but blocked on decisions about changing the API. I think we should talk through the issues later in this meeting.

Intl.DisplayNames

Proposal

FT: Not much activity from last meeting. ZB signed up to write an introduction to discuss the scope. The other is the error handling. We haven’t decided on that issue. DE also asked about supporting timezones. I said that timezones are more complicated, so my position is that we shouldn’t put that in for version 1.

DE: To clarify, the complexity for timezone is ___? I’m okay saving that for version 2. In terms of error handling, it seems returning undefined makes sense, but I don’t have a strong opinion.

SC: Do you have an intention to move this to Stage 2 at the next TC39?

FT: That’s my intention.

SC: To be clear, both of the open issues should be resolved before Stage 2.

MM: I have no objection to it reaching Stage 2. I just want it to describe the fundamental difference between the strings this API provides and the strings that the website will provide. The introduction needs to state fundamentally what the difference is. What is the criteria for adding new types of translations to this API? We don’t want to open this up to anything. We need a framework to judge future requests.

SC: I think we need to get this written up before next ECMA 402 meeting.

SL: Maybe have a positive and negative example of something that could be added.

FT: That might be too complicated?

MM: I’m more interested in a description of what falls on each side of the line. Examples just say that the line is somewhere in between.

ZB: I’ll try to get back to this next week.

Discussions

Board

New Board

SC: (explains.) We should document this.

LB: We should make a wiki page to give the descriptions.

Is this an API for iterating segments, or boundaries? #59

RG: Yeah, so the issue is whether to make the iterator describe segments or breaks between segments. We agreed a few meetings ago to iterate over boundaries. Now we have to figure out what that looks like. There are various issues discussed in issue #59. It blossoms into a question on whether this is the right API? We have a PR from DE to do the first step of this. The open question on that is around naming. Then we get to a few others as follow-ups.

DE: Yeah, I was confused how the follow-ups were related.

RG: Sure. So the first one is about break type. There is a getter on the iterator itself about the granularity of iteration. If we include that, it just needs a better specification, because when you visit a boundary, it matters what is first versus last. A word boundary is not clearly differentiated from a non-word boundary in the text. Now that we’re iterating over boundaries…

DE: That makes sense. Do you have an idea how to specify that?

RG: My first thought is to remove it.

DE: We discussed a while ago whether to include this. It’s fuzzy because it’s not clear what segment types are. But my sense is that this is really important for word vs. non-word, or for sentence breaks. So if we remove it, do you think people would create their own regexps to tell what’s going on?

SL: Are you saying to remove breakType?

RG: Yes, from the two locations.

SL: Yes, that’s critical for word break because it tells you whether you’re in a word or non-word.

RG: Well, for breaks, you’re never really in a word.

FT: How do you say, before this point is a word, and after this point is not a word?

RG: I would be open to contemplating ways in which we could better support users for this purpose. What we have now doesn’t do that.

DE: I want to clarify that something that covers this use case needs to be part of the MVP.

RG: That sounds reasonable. Should we discuss that now?

DE: We’ve been stalled on this for awhile.

RG: My question is, if we support backwards iteration, then now you’ve visited it in the opposite direction.

DE: Yeah, backwards iteration is useful for things like find-in-page.

FT: RG’s point is, in the condition of backwards iteration, describing the condition is challenging.

RG: I think backwards iteration is more valuable than break type. That’s why I’m suggesting that we focus on backwards iteration and random access, and then we can figure out break type.

SC: So to be clear, you’re saying remove breakType for the purpose of figuring out the API, but we’re on the same page that we should add back the feature before we’re back in Stage 3?

RG: Yes.

SL: I really don’t support removing breakType.

DE: I think we should talk through the proposal for the API all in one.

RG: Okay, so my proposal is that the preceding and following methods return the exact same thing as ___.

DE: THat was a deliberate design decision not to do that. Because we don’t want to make a bunch of allocations all the time. And performance concerns come up.

RG: Is this a decision we can make in this meeting?

DE: We’ve made that decision before.

FT: To, it used to be returning true or false, and you can check the information in the iterator. So what is the benefit of returning a new object?

RG: I think the values are just different whether you’re iterating forward or backward. So index makes sense. But what properties that are part of the next value should not be replicated onto the iterator itself?

DE: I don't understand; aren't all properties present?

RG: The design has segment, breakType, and index, but segment is not replicated.

DE: Oh, yeah, segment is not included as it's about the iteration, and the others are about the state.

RG: We switched from position to index.

DE: Okay, so index and breakType is a property of the iterator, and the segment can be calculated by the user.

RG: Okay, so we have index and breakType, even though breakType is on shaky ground.

DE: You can make proposals to breakType.

SL: Can you give an example of a backwards iteration where the breakType is not what you expect?

RG: …

FT: Let me paraphrase. Say we have: “hello world”. So basically we have 4 points: the beginning and end of each word. The issue is, if we do word iteration, when you are moving forward, when you hit the break between “hello” and space, then you get one point type. But if you’re moving backward, do you describe it using the same value or a different value? From a forward point of view, you’re moving from word to space. From backward point of view, it’s from space to word.

RG: Yes, that’s what I mean.

DE: Since there are a lot of languages that have word breaks without a word on either side, I’m not sure how you would make a type predicate. It seems like we want to have a type for the segment.

SL: Thank you for restating that. To be clear, RG, I’m just trying to figure out how this should be. You effectively have a different type depending on which direction you’re doing. From that sense, the type could be… you want symmetry going forwards and backwards. So it doesn’t matter what direction you arrived at this break.

FT: So look at it this way. If the spec says this is a word, does that mean you exited a word? It specifies the token before or the token after. But we’re describing a particular break. We’re describing the index between, not the segment before or after.

SL: So it seems to make sense that the types are different in each direction.

RG: The type could be like “afterWord”, which can happen whether you’re moving forward or backward.

SL: We should also not restrict ourselves to word or not word. For example, there are numerics.

RG: That’s on the consumer.

SL: No, that’s part of the text analysis. I don’t want us to restrict this to word or not word.

DE: Right now, word is the only type. We could broaden that to more types.

SL: If the type is only binary, word or not word, that’s problematic.

RG: If we’re moving forward or backward, that type should be the same.

SL: So in thinking of that return value, you may have to do different work. If you’re only going forward through the text, it makes a difference as to what content is there. So in “hello world”, if I start at the beginning of the text, and I have the letter ‘h’. This is a word with letters in it. Next we go forward, and we stop at the point between ‘o’ and space. Going forward, we have a non-word in front of us. If we were going backwards, we would see that there is a word prior to that point. So what we’re trying to come up with, we need to have one return value. The state there is that there is a word before and a non-word after.

SC: Is there a reason we don’t have two separate fields, beforeType and afterType?

RG: That makes sense in concept. It has some issues with the beginning and end of the string. Requirement is a) that the properties are not dependent upon which direction was used, and b) that their interpretation is meaningful regardless of which direction was used.

FT: It should be easy for the developer looking at the spec to figure out what that means.

SL: Let me give a concern. If I’m only going forward through the text, and we provide the type going forward and backward, I’m concerned with the computational complexity. We’d either have to cache the forward value going back, or run the iterator again. It could possibly require the implementation to do more work.

CC: Could the value be, not yet available?

SL: That breaks the invariant of the break type being the same regardless of how you arrived there.

RG: To discover a boundary in the backwards direction, it means you’ve looked at something that precedes it. If the types are word and non-word, for word breaking, the distance you have to look backwards is bounded. For sentences, you have to go back and find the boundary.

SL: What you just described is expensive if it’s unnecessary. The performance of segmentation can be critical at times. For sentences specifically, besides using the exceptions list, that could require traversing quite a number of characters.

RG: I don’t believe so. I think you need to look back only one character. A question mark followed by a newline is only one break.

DE: (gives counter-example)

RG: Yes, but that’s already necessary for boundary discovery. I’m talking about boundary classification once it’s already discovered.

SC: My understanding of ICU break iterator is that backwards iteration already goes back to a stop point and then back forward again. So maybe there’s not extra work.

SL: My concern is eagerly doing computation that the user didn’t ask for.

RG: We could contemplate removing breakType and putting that on the consumer.

FT: We need to clarify the language of the spec.

RG: I would like to see maybe “precedingSegmentType”.

SC: Are there any other issues to discuss in the meeting?

RG: There’s one more aspect I want to visit in this meeting, which is the semantics for preceding and following, and what we want index to represent in that case. For “following,” we have a general desire that if you are going forward and you invoke following with the index … , that is equivalent to normal forward iteration. Right? So if I have an index that corresponds to a boundary, and I say preceding to that index, if the index is already after a boundary, like index 5 in “hello world”, following that you get index 6. What do you get when you get a preceding of 5? Do you get 5 or 0?

FT: I think 0.

SC: I think 0.

RG: I will make sure the spec is consistent here.

Units: Spec and Data Size #39

tc39/proposal-unified-intl-numberformat#39

https://docs.google.com/spreadsheets/d/1Id9Zvf2MvdLdTG9J-AVB1VTdeB2edA2lY9Rlo0g0uVU/edit#gid=0