A clickable visual guide to the Rust type system

(rustcurious.com)

249 points | by ashvardanian a day ago

41 comments

craftkiller a day ago
This is such a small thing, but I love the inclusion of the value ranges for the integers! I can never remember which side can go one deeper ("is it [-128 to 127] or [-127 to 128]"). Bookmarking this for reference later!
- newpavlov a day ago
  Tangential note: I sometimes wish that signed integers were symmetrical. i8 would represent the range of [-127 to 127] with 0xFF representing NaN. Any operation which cannot be computed (division by zero, overflow, an operation with another NaN, etc.) would result in NaN. For further symmetry we could do the same for unsigned integers as well.
  Yes, it's possible to encode such types manually, but it will not be efficient since CPUs do not natively support such operations.
  - lock1 a day ago
    Wouldn't this make CPU flags useless? I think it would complicate branch instructions too, as most modern CPUs tend to use integer operations for branching.
    Also, this in-band signaling would probably invite something similar to the `null` mess in type systems. I can't wait to tell the CPU to JMP NaN.
    - newpavlov a day ago
      >Wouldn't this make CPU flags useless?
      They would, but I agree with RISC-V here, CPUs should not rely on them in the first place.
      I do not understand your argument about branches, how would it hinder the jump instructions?
      We still would need separate "wrapping" instructions (e.g. for implementing bigints and cryptographic algorithms), but they probably could be limited to unsigned operations only.
      >I can't wait to tell CPU to JMP NaN.
      How is it different from jumping to null? If you do such a jump, it means you have a huge correctness problem with your code.
      - lock1 a day ago
        > I do not understand your argument about branches, how would it hinder the jump instructions?
        Extra set of logic for handling NaN cases? I don't think it's impossible, just kind of less intuitive. A jump instruction using an integer without NaN is always valid, while one using a NaN-able integer is sometimes invalid (ignoring whether the memory address can be accessed).
        - newpavlov a day ago
          For absolute jumps you don't need extra logic, since CPUs could declare the last page always unmapped, so such jumps would always result in a page fault (similarly to the null page on most systems).
          For relative non-immediate jumps the added logic is extremely simple (hardware exception on NaN) and should not (AFAIK) hinder performance of jumps in any way.
  - zokier a day ago
    That sounds like a surprisingly reasonable idea for signeds. Less so for unsigneds though. Has there been any architecture doing anything like that?
    - newpavlov a day ago
      I cannot name an ISA with such instructions off the top of my head.
      As for unsigned integers, as I mentioned in the other comment, we probably need two separate instruction sets for "wrapping" and NaN-able operations on unsigned integers.
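For readers following the NaN-able integer idea in this subthread, here is a minimal software sketch in Rust. It is only an illustration under assumptions: the type name `NanI8` and its methods are hypothetical, and the NaN pattern is assumed to be 0x80 (`i8::MIN`), the one two's complement pattern left over once the range is [-127, 127] (the comment above mentions 0xFF instead). As the thread notes, emulating this in software costs extra checks that native hardware support would avoid.

```rust
/// Hypothetical NaN-able signed byte: values in [-127, 127] plus one NaN pattern.
/// Assumption: NaN is stored as 0x80 (i8::MIN).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NanI8(i8);

impl NanI8 {
    const NAN: NanI8 = NanI8(i8::MIN);

    /// Wraps a raw i8; i8::MIN is interpreted as NaN by construction.
    fn new(v: i8) -> NanI8 {
        NanI8(v)
    }

    fn is_nan(self) -> bool {
        self.0 == i8::MIN
    }

    /// Converts the result of a checked i8 operation.
    /// None (overflow, division by zero) becomes NaN, and so does
    /// Some(i8::MIN), because -128 is outside the symmetric range.
    fn from_checked(r: Option<i8>) -> NanI8 {
        match r {
            Some(v) => NanI8(v),
            None => NanI8::NAN,
        }
    }

    /// Addition that propagates NaN and turns overflow into NaN.
    fn add(self, rhs: NanI8) -> NanI8 {
        if self.is_nan() || rhs.is_nan() {
            return NanI8::NAN;
        }
        NanI8::from_checked(self.0.checked_add(rhs.0))
    }

    /// Division that propagates NaN and turns division by zero into NaN.
    fn div(self, rhs: NanI8) -> NanI8 {
        if self.is_nan() || rhs.is_nan() {
            return NanI8::NAN;
        }
        NanI8::from_checked(self.0.checked_div(rhs.0))
    }
}

fn main() {
    let a = NanI8::new(100);
    println!("{:?}", a.add(NanI8::new(27))); // in range
    println!("{:?}", a.add(NanI8::new(28))); // overflow becomes NaN
    println!("{:?}", a.div(NanI8::new(0))); // division by zero becomes NaN
}
```

Note that the derived `PartialEq` makes NaN compare equal to itself here, unlike floating-point NaN; that is a simplification of this sketch, not something the thread specifies.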
- throwawaymaths a day ago
  It's always negative. 0x8000... cannot have a two's complement, and the top bit is set.
  - delusional a day ago
    I find that the easiest way to remember it is to remember that 0 is positive but has no negative counterpart.
    - high_priest a day ago
      The "0 is positive" part is not true, but some day you are hopefully going to get it.
      The true answer is that negative numbers have the top bit set, which can't be used for positive numbers. Hence positives are one bit short.
      - delusional 15 hours ago
        You're literally saying the same thing as me.
        All negative numbers have the most significant bit set and 0 is the number with no bits set, ergo 0 must be positive since the most significant bit is not set.
        Now arithmetically, this is untrue. We'll usually treat 0 as neither positive nor negative (or in certain cases both negative and positive), but bitwise, in terms of the two's complement implementation, zero is positive. We know that since it exists in the unsigned version of the types as well.
        Hopefully you'll see that some day.
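The disagreement in this subthread can be checked directly. The following is a small Rust sketch (the helper names `sign_bit_set` and `count_by_sign_bit` are made up for illustration) showing that exactly half of the `i8` bit patterns have the top bit set, that zero sits in the other half, and that `i8::MIN` (0x80) is the one value whose negation does not fit.

```rust
/// True if the most significant bit of the byte is set.
fn sign_bit_set(x: i8) -> bool {
    (x as u8) & 0x80 != 0
}

/// Counts i8 values with the top bit set and with it clear.
fn count_by_sign_bit() -> (u32, u32) {
    let (mut set, mut clear) = (0, 0);
    for x in i8::MIN..=i8::MAX {
        if sign_bit_set(x) {
            set += 1;
        } else {
            clear += 1;
        }
    }
    (set, clear)
}

fn main() {
    // 128 patterns have the top bit set (the negatives), 128 do not.
    // Zero takes one of the "clear" patterns, leaving 127 positives.
    assert_eq!(count_by_sign_bit(), (128, 128));
    assert!(!sign_bit_set(0));
    assert_eq!((i8::MIN, i8::MAX), (-128, 127));

    // -1 is all ones and negates to 1 without trouble.
    assert_eq!(-1i8 as u8, 0xFF);
    assert_eq!((-1i8).checked_neg(), Some(1));

    // i8::MIN is 0x80; negating it overflows, and wrapping gives it back.
    assert_eq!(i8::MIN as u8, 0x80);
    assert_eq!(i8::MIN.checked_neg(), None);
    assert_eq!(i8::MIN.wrapping_neg(), i8::MIN);

    println!("all checks passed");
}
```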
- jibal a day ago
  I can't imagine suffering from that. Understanding twos complement representation is an essential programming skill. And a byte value of 128? What is that in hex?
  - dzaima a day ago
    You could pretty easily have an integer representation using [-127; 128]; 128 being 0x80 of course (all other values being the same as in two's complement). Still would hold that -n == 1 + ~n, zero is all-zeroes, and the property that add/sub needn't care about signed vs unsigned. Only significant difference being that top bit doesn't determine negativeness, though of course it's still "x < 0" in code. (at the hardware level, sign extension & comparisons might also get very slightly more complicated, but that's far outside what typical programmers would need to know)
    For most practical purposes outside of low-level stuff all that really matters about two's complement is Don't Get Near 2^(width-1) Or Bad™ Things Happen. Including +128 would even have the benefit of 1<<7 staying positive.
    - moefh a day ago
      > Only difference being that you need to do a bit more work to determine negativeness (work which in hardware you'd already likely have the bulk of for determining is-zero).
      The work needed to calculate the overflow flag (done in every add/sub operation in most ISAs) is also way more complicated when the high bit does not represent sign.
      - dzaima a day ago
        Oh, true. Even further down low-level/frequently-unused details though; and RISC-V does without it (/ flags in general) roughly fine.
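To make the alternative [-127; 128] representation from this subthread concrete, here is a rough Rust sketch. The functions `decode` and `encode` are hypothetical helpers, not anything from a real ISA or library; they assume 0x80 means +128 and every other byte keeps its two's complement meaning, as described above.

```rust
/// Decodes a byte in the hypothetical [-127, 128] representation:
/// 0x80 means +128, every other pattern is read as two's complement.
fn decode(b: u8) -> i16 {
    if b == 0x80 {
        128
    } else {
        (b as i8) as i16
    }
}

/// Encodes a value in -127..=128 by keeping its low 8 bits.
fn encode(v: i16) -> u8 {
    assert!((-127..=128).contains(&v), "value out of range");
    v as u8
}

fn main() {
    // -n == 1 + ~n still holds for every value that has a negation.
    for n in -127i16..=127 {
        assert_eq!(encode(-n), (!encode(n)).wrapping_add(1));
    }

    // Addition is the same byte-wise add as for unsigned integers.
    assert_eq!(decode(encode(100).wrapping_add(encode(28))), 128);

    // 1 << 7 stays positive in this representation.
    assert_eq!(decode(1u8 << 7), 128);

    // The top bit alone no longer decides the sign: 0x80 is positive.
    assert!(decode(0x80) > 0 && decode(0x81) < 0);

    println!("all checks passed");
}
```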
  - AnIrishDuck a day ago
    > Understanding twos complement representation is an essential programming skill
    The field of programming has become so broad that I would argue the opposite. The vast majority of developers will never need to think about let alone understand twos complement as a numerical representation.
  - wubrr a day ago
    > Understanding twos complement representation is an essential programming skill.
    It is completely irrelevant for the vast majority of programming.
  - oconnor663 a day ago
    What is your goal for this comment?
  - koakuma-chan a day ago
    I have no idea what two's complement representation is
    - koakuma-chan a day ago
      It just means the most significant bit represents the sign?
      - craftkiller a day ago
        It's a little bit more complicated than that. If only the most significant bit represented the sign then you'd have both positive and negative zero (which is possible with floats), and you'd only be able to go from [-127 to 127]. Instead, the MSB still tells you the sign, but to negate a number you flip all the bits and add 1. It is only relevant for signed integers, not unsigned integers.
        - pests a day ago
          Ben Eater has a really good YT video on this.
      - lock1 a day ago
        That's called "sign-magnitude": the most significant bit represents only the sign. ("Ones' complement" is a related scheme that negates by flipping every bit.) Like the sibling post mentioned, it does have a weird quirk of having 2 representations for 0: (-0) and (+0).
        "Two's complement" instead gives the MSB a negative weight rather than a positive one. For example, in 4-bit two's complement: 1000 represents -8 (in unsigned 4-bit, this would be +8), 0100 represents 4, 0010 represents 2, 0001 represents 1. Some more numbers: 7 (0111), -7 (1001), 1 (0001), -1 (1111).
        Intuitively, the sign-magnitude MSB represents a multiplication by (-1), while the two's complement MSB adds (-N), with N = 2^(bit length - 1); for 4-bit two's complement that is (-2^3) or (-8). In both representations the non-MSB bits work exactly as in an unsigned integer.
      - harpiaharpyja 9 hours ago
        The other replies do a good job of explaining what 2s complement is.
        I find the best way to understand why 2s complement is so desirable is to write down the entire number line for e.g. 4-bit integers.
        Using a plain sign bit (sign-magnitude), the negative numbers are backwards. 2s complement fixes this, so that arithmetic works and you can do addition and subtraction without any extra steps.
        (Remember that negative numbers are less than positive numbers, so the correct way to count them is:
        -8 -7 -6 -5 -4 -3 -2 -1 0 +1 +2 +3 +4 +5 +6 +7
        Where -1 is the largest possible negative number)
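The 4-bit examples in this subthread can be reproduced with a short Rust sketch. The function names `twos_complement_4bit` and `negate_4bit` are hypothetical, chosen only for this illustration; the first applies the "top bit weighs -8" rule, the second applies "flip all the bits and add 1" within four bits.

```rust
/// Value of a 4-bit two's complement pattern:
/// the top bit weighs -8, the remaining bits weigh 4, 2 and 1.
fn twos_complement_4bit(bits: u8) -> i8 {
    let b = (bits & 0x0F) as i8;
    -8 * ((b >> 3) & 1) + (b & 0b0111)
}

/// Negates a 4-bit pattern by flipping all bits and adding one,
/// keeping only the low four bits of the result.
fn negate_4bit(bits: u8) -> u8 {
    (!bits).wrapping_add(1) & 0x0F
}

fn main() {
    // Examples quoted in the thread.
    assert_eq!(twos_complement_4bit(0b1000), -8);
    assert_eq!(twos_complement_4bit(0b0111), 7);
    assert_eq!(twos_complement_4bit(0b1001), -7);
    assert_eq!(twos_complement_4bit(0b1111), -1);

    // Flip-and-add-one turns 7 into -7, and leaves -8 unchanged
    // because +8 does not exist in four bits.
    assert_eq!(negate_4bit(0b0111), 0b1001);
    assert_eq!(negate_4bit(0b1000), 0b1000);

    // Print the number line from -8 to +7 with each bit pattern.
    let mut line: Vec<(i8, u8)> = (0u8..16)
        .map(|p| (twos_complement_4bit(p), p))
        .collect();
    line.sort();
    for (value, pattern) in line {
        println!("{:04b} = {:+}", pattern, value);
    }
}
```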
  - craftkiller a day ago
    Eh, how often are you going down to the bit representation of signed integers? Naturally I learned two's complement ages ago, but all of my bitwise manipulation seems to be on unsigned integers (and frankly I've only used bitwise operations at work once for implementing bloom filters. Normally I only get to do lower level stuff like that in side-projects). So internalizing two's complement has never seemed relevant.
    > And a byte value of 128? What is that in hex?
    0x80
    - jibal 17 hours ago
      > 0x80
      Which of course has the sign bit set.
      The comments here are educational ... I hadn't realized that the field of programming had become this degraded.
      - superblas 5 hours ago
        Such needless condescension, jibal.
goku12 a day ago
Adding another resource I use frequently: https://cheats.rs/
One part that I love especially about it is that it represents lifetimes [1] and memory layout [2] of data structures in graphical format. They're as invaluable as API references. I would love to see it included in other documentation as well.
[1] https://cheats.rs/#memory-lifetimes
[2] https://cheats.rs/#memory-layout
mattlutze 7 hours ago
I love a page that doesn't react to my browser width.
adastra22 a day ago
Why is PhantomData in the unsafe support group?
- john-h-k a day ago
  It obviously can be used for other things but it principally was designed for unsafe support (allowing dropck to understand unsafe types that own a value through a pointer). See https://doc.rust-lang.org/nomicon/phantom-data.html
  - saghm a day ago
    Interesting, I've had to use it a number of times over the years despite never really doing much unsafe. At least to me, it seems pretty well-scoped as a workaround from the requirements that the compiler has around needing to use generic type parameters in type definitions, which certainly isn't something you need to be writing unsafe code to run into. I wouldn't be shocked if it used unsafe under the hood, but then again, so does Vec.
    - afdbcreid a day ago
      The original reason for designing it (instead of the previously inferred bivariance) was so that authors of unsafe code that really does not want bivariance, and would be unsound with it, remember to consider variance.
      It doesn't use unsafe under the hood, rather it's compiler magic.
    - john-h-k a day ago
      > At least to me, it seems pretty well-scoped as a workaround from the requirements that the compiler has around needing to use generic type parameters in type definitions
      The reason those requirements exist is (primarily) to do with unsafe code. Specifically it’s about deciding the variance of the type (which doesn’t matter for a truly unused type parameter).
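As an illustration of the safe-code use of PhantomData mentioned in this subthread, here is a minimal Rust sketch of a typed ID. The names `Id`, `User`, `Order`, and `lookup_user` are invented for this example. Without the `_tag` field the compiler rejects `Id<T>` with error E0392 because `T` is never used; `PhantomData<T>` satisfies that requirement without storing anything.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

// Marker types that exist only to tag IDs (hypothetical).
struct User;
struct Order;

/// Hypothetical typed ID: T is a compile-time tag, never stored.
struct Id<T> {
    raw: u32,
    _tag: PhantomData<T>,
}

impl<T> Id<T> {
    fn new(raw: u32) -> Self {
        Id { raw, _tag: PhantomData }
    }
}

/// Accepts only user IDs; passing an Id<Order> is a type error.
fn lookup_user(id: Id<User>) -> u32 {
    id.raw
}

fn main() {
    let user_id: Id<User> = Id::new(7);
    let _order_id: Id<Order> = Id::new(7);

    assert_eq!(lookup_user(user_id), 7);
    // lookup_user(_order_id); // would not compile: expected Id<User>

    // PhantomData is zero-sized, so the tag costs no memory.
    assert_eq!(size_of::<Id<User>>(), size_of::<u32>());
    println!("all checks passed");
}
```

This sketch deliberately involves no unsafe code; the unsafe-related motivation (variance and drop checking for types that own data through raw pointers) is covered in the Rustonomicon page linked earlier in the thread.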
stmw 21 hours ago
It's very good, thanks for getting it some attention. Also - to show how much I agree - https://news.ycombinator.com/item?id=45140572
smj-edison a day ago
I really like how it scrolls left-to-right on mobile, instead of collapsing down.
6r17 a day ago
There aren't that many of them, actually! It almost feels like a table of elements.
shmerl 19 hours ago
Really nice and concise presentation!