A clickable visual guide to the Rust type system

(rustcurious.com)

249 points | by ashvardanian a day ago

41 comments

  • craftkiller a day ago

    This is such a small thing, but I love the inclusion of the value ranges for the integers! I can never remember which side can go one deeper ("is it [-128 to 127] or [-127 to 128]"). Bookmarking this for reference later!
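
For reference, a minimal Rust sketch of where those ranges come from. The helper names `signed_range` and `i8_range` are made up for illustration; `i8::MIN` and `i8::MAX` are the standard library constants.

```rust
/// Range of an n-bit two's complement integer: [-2^(n-1), 2^(n-1) - 1].
/// `signed_range` is a hypothetical helper, not a standard library function.
fn signed_range(bits: u32) -> (i128, i128) {
    assert!((1..=64).contains(&bits), "sketch only supports 1 to 64 bits");
    let half = 1i128 << (bits - 1); // 2^(n-1)
    (-half, half - 1)
}

/// The same bounds for i8, read straight from the standard library constants.
fn i8_range() -> (i8, i8) {
    (i8::MIN, i8::MAX)
}
```

Here `signed_range(8)` gives `(-128, 127)`, the same pair as `i8_range()`: the negative side is the one that goes one deeper.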

    • newpavlov a day ago

      Tangential note: I sometimes wish that signed integers were symmetrical. i8 would represent the range of [-127 to 127] with 0xFF representing NaN. Any operation which cannot be computed (division by zero, overflows, operation with another NaN, etc.) would result in NaN. For further symmetry we could do the same for unsigned integers as well.

      Yes, it's possible to encode such types manually, but it will not be efficient since CPUs do not natively support such operations.
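
A rough software sketch of such a manually encoded type, assuming Rust. `NanI8` and its methods are hypothetical names. The sketch also assumes the spare two's complement pattern 0x80 (`i8::MIN`) as the NaN sentinel rather than the 0xFF named in the comment, since 0xFF is -1 in two's complement.

```rust
/// Hypothetical "NaN-able" signed byte: valid values are -127..=127.
/// Assumption: the spare two's complement pattern 0x80 (i8::MIN) marks NaN.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct NanI8(i8);

impl NanI8 {
    const NAN: NanI8 = NanI8(i8::MIN);

    /// Wraps a plain i8; -128 is outside the symmetric range, so it reads as NaN.
    fn new(v: i8) -> Self {
        NanI8(v)
    }

    fn is_nan(self) -> bool {
        self.0 == i8::MIN
    }

    /// Runs a checked operation; NaN inputs, overflow, division by zero,
    /// or a raw -128 result all end up as NaN.
    fn lift(self, rhs: NanI8, op: fn(i8, i8) -> Option<i8>) -> NanI8 {
        if self.is_nan() || rhs.is_nan() {
            return NanI8::NAN;
        }
        // `None` maps to the sentinel; a raw -128 already is the sentinel.
        NanI8(op(self.0, rhs.0).unwrap_or(i8::MIN))
    }

    fn add(self, rhs: NanI8) -> NanI8 {
        self.lift(rhs, i8::checked_add)
    }

    fn sub(self, rhs: NanI8) -> NanI8 {
        self.lift(rhs, i8::checked_sub)
    }

    fn div(self, rhs: NanI8) -> NanI8 {
        self.lift(rhs, i8::checked_div)
    }
}
```

Every operation here pays for extra comparisons and branches on top of the plain integer instruction, which is the inefficiency the comment refers to.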

      • lock1 a day ago

        Wouldn't this make CPU flags useless? I think it would complicate branch instructions too, as most modern CPUs tend to use integer operations for branching.

        Also, this in-band signaling probably would invite something similar to the `null` mess in type systems. I can't wait to tell CPU to JMP NaN.

        • newpavlov a day ago

          >Wouldn't this make CPU flags useless?

          They would, but I agree with RISC-V here, CPUs should not rely on them in the first place.

          I do not understand your argument about branches, how would it hinder the jump instructions?

          We still would need separate "wrapping" instructions (e.g. for implementing bigints and cryptographic algorithms), but they probably could be limited to unsigned operations only.

          >I can't wait to tell CPU to JMP NaN.

          How is it different from jumping to null? If you do such a jump, it means you have a huge correctness problem with your code.

          • lock1 a day ago

            > I do not understand your argument about branches, how would it hinder the jump instructions?

            An extra set of logic for handling NaN cases? I don't think it's impossible, just kind of less intuitive. A jump instruction using an integer without NaN is always valid, while one using a NaN-able integer is sometimes invalid (ignoring whether the memory address can be accessed).

            • newpavlov a day ago

              For absolute jumps you don't need extra logic, since CPUs could declare the last page always unmapped, so such jumps would always result in a page fault (similarly to the null page on most systems).

              For relative non-immediate jumps the added logic is extremely simple (hardware exception on NaN) and should not (AFAIK) hinder performance of jumps in any way.

      • zokier a day ago

        That sounds like a surprisingly reasonable idea for signed integers. Less so for unsigned ones, though. Has there been any architecture doing anything like that?

        • newpavlov a day ago

          I cannot name an ISA with such instructions off the top of my head.

          As for unsigned integers, as I mentioned in the other comment, we probably need two separate instruction sets for "wrapping" and NaN-able operations on unsigned integers.

    • throwawaymaths a day ago

      It's always negative. 0x8000... cannot be negated in two's complement, and its top bit is set.

      • delusional a day ago

        I find that the easiest way to remember it is to remember that 0 is positive but has no negative counterpart.

        • high_priest a day ago

          That 0 is positive is not true, but some day you are hopefully going to get it.

          The true answer is that negative numbers have the top bit set, which can't be used for positive numbers. Hence positives are one bit short.

          • delusional 15 hours ago

            You're literally saying the same thing as me.

            All negative numbers have the most significant bit set and 0 is the number with no bits set, ergo 0 must be positive since the most significant bit is not set.

            Now arithmetically, this is untrue. We'll usually treat 0 as neither positive nor negative (or in certain cases both negative and positive), but bitwise, in terms of the twos-complement implementation, zero is positive. We know that since it exists in the unsigned version of the types as well.

            Hopefully you'll see that some day.

    • jibal a day ago

      I can't imagine suffering from that. Understanding twos complement representation is an essential programming skill. And a byte value of 128? What is that in hex?

      • dzaima a day ago

        You could pretty easily have an integer representation using [-127; 128]; 128 being 0x80 of course (all other values being the same as in two's complement). Still would hold that -n == 1 + ~n, zero is all-zeroes, and the property that add/sub needn't care about signed vs unsigned. Only significant difference being that top bit doesn't determine negativeness, though of course it's still "x < 0" in code. (at the hardware level, sign extension & comparisons might also get very slightly more complicated, but that's far outside what typical programmers would need to know)

        For most practical purposes outside of low-level stuff all that really matters about two's complement is Don't Get Near 2^(width-1) Or Bad™ Things Happen. Including +128 would even have the benefit of 1<<7 staying positive.
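
A small Rust sketch of the properties listed above, as they hold for ordinary two's complement `i8`. The function names are made up for illustration; the wrapping methods are from the standard library.

```rust
/// Two's complement negation as "flip the bits, add one": -n == 1 + ~n.
fn negate_by_hand(n: i8) -> i8 {
    (!n).wrapping_add(1)
}

/// Adds two bytes as unsigned, then reinterprets the result bits as signed.
/// The adder does not need to know which interpretation the caller has in mind.
fn add_via_unsigned(a: i8, b: i8) -> i8 {
    (a as u8).wrapping_add(b as u8) as i8
}

/// 1 << 7 sets only the top bit, which two's complement reads as -128.
fn one_shl_seven() -> i8 {
    let one: i8 = 1;
    one << 7
}
```

In this sketch `negate_by_hand(i8::MIN)` returns `i8::MIN` again, which is the "don't get near 2^(width-1)" hazard in miniature, and `one_shl_seven()` is -128 rather than the +128 that the alternative [-127; 128] encoding would give.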

        • moefh a day ago

          > Only difference being that you need to do a bit more work to determine negativeness (work which in hardware you'd already likely have the bulk of for determining is-zero).

          The work needed to calculate the overflow flag (done in every add/sub operation in most ISAs) is also way more complicated when the high bit does not represent sign.
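
To illustrate why the sign bit makes this cheap, here is one common software formulation of the signed overflow check, sketched in Rust. `add_overflows` is a hypothetical name, and real hardware may derive the flag differently.

```rust
/// Signed-overflow check for `a + b` using only sign bits: overflow happened
/// when both operands share a sign and the wrapped result has the opposite one.
fn add_overflows(a: i8, b: i8) -> bool {
    let r = a.wrapping_add(b);
    // The top bit of (a ^ r) & (b ^ r) is set only if r's sign differs
    // from the sign of both a and b.
    ((a ^ r) & (b ^ r)) < 0
}
```

The check needs two XORs, one AND, and a look at a single bit. It agrees with the flag returned by the standard library's `i8::overflowing_add` for every pair of inputs.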

          • dzaima a day ago

            Oh, true. Even further down low-level/frequently-unused details though; and RISC-V does without it (/ flags in general) roughly fine.

      • AnIrishDuck a day ago

        > Understanding twos complement representation is an essential programming skill

        The field of programming has become so broad that I would argue the opposite. The vast majority of developers will never need to think about let alone understand twos complement as a numerical representation.

      • wubrr a day ago

        > Understanding twos complement representation is an essential programming skill.

        It is completely irrelevant for the vast majority of programming.

      • oconnor663 a day ago

        What is your goal for this comment?

      • koakuma-chan a day ago

        I have no idea what twos complement representation is

        • koakuma-chan a day ago

          It just means the most significant bit represents the sign?

          • craftkiller a day ago

            It's a little bit more complicated than that. If only the most significant bit represented the sign then you'd have both positive and negative zero (which is possible with floats), and you'd only be able to go from [-127 to 127]. Instead, it's some incantation where the MSB is the sign but then you flip all the bits and add 1. It is only relevant for signed integers, not unsigned integers.

            • pests a day ago

              Ben Eater has a really good YT video on this.

          • lock1 a day ago

            That's called "sign-magnitude": the most significant bit represents only the sign. Like the sibling post mentioned, it does have a weird quirk of having 2 representations for 0: (-0) and (+0).

            Meanwhile, "twos complement" turns the MSB's unsigned value into a negative instead of a positive. For example, in 4-bit twos complement: 1000 represents -8 (in unsigned 4-bit, this is supposed to be +8), 0100 represents 4, 0010 represents 2, 0001 represents 1. Some more numbers: 7 (0111), -7 (1001), 1 (0001), -1 (1111).

            Intuitively, the sign-magnitude MSB represents a multiplication by (-1), while the "twos complement" MSB adds (-N), with N = 2^(bit length - 1); in the case of 4-bit twos complement it's (-2^3) or (-8). Both representations leave the non-MSB bits working exactly like an unsigned integer.
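
The two readings of the top bit described above can be sketched in Rust as follows. Both function names are hypothetical, and the "multiply by (-1)" scheme is the one usually called sign-magnitude.

```rust
/// Decodes the low `bits` bits of `pattern` as two's complement:
/// the top bit weighs -2^(bits-1), every other bit keeps its unsigned weight.
fn twos_complement_value(pattern: u32, bits: u32) -> i64 {
    assert!((1..=32).contains(&bits), "sketch only supports 1 to 32 bits");
    let msb = 1u64 << (bits - 1);
    let p = pattern as u64;
    let low = (p & (msb - 1)) as i64; // unsigned weight of the non-MSB bits
    let top = if p & msb != 0 { -(msb as i64) } else { 0 };
    top + low
}

/// Decodes the same bits as sign-magnitude:
/// the top bit multiplies the value of the remaining bits by -1.
fn sign_magnitude_value(pattern: u32, bits: u32) -> i64 {
    assert!((1..=32).contains(&bits), "sketch only supports 1 to 32 bits");
    let msb = 1u64 << (bits - 1);
    let p = pattern as u64;
    let magnitude = (p & (msb - 1)) as i64;
    if p & msb != 0 { -magnitude } else { magnitude }
}
```

With 4 bits, `twos_complement_value(0b1001, 4)` is -7 while `sign_magnitude_value(0b1001, 4)` is -1, and `sign_magnitude_value(0b1000, 4)` comes out as 0, the second zero mentioned above.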

            • 9 hours ago

              [deleted]

          • harpiaharpyja 9 hours ago

            The other replies do a good job of explaining what 2s complement is.

            I find the best way to understand why 2s complement is so desirable is to write down the entire number line for e.g. 4-bit integers.

            Using sign-magnitude, the negative numbers are backwards. 2s complement fixes this, so that arithmetic works and you can do addition and subtraction without any extra steps.

            (Remember that negative numbers are less than positive numbers, so the correct way to count them is:

            -8 -7 -6 -5 -4 -3 -2 -1 0 +1 +2 +3 +4 +5 +6 +7

            Where -1 is the largest possible negative number)

      • craftkiller a day ago

        Eh, how often are you going down to the bit representation of signed integers? Naturally I learned two's complement ages ago, but all of my bitwise manipulation seems to be on unsigned integers (and frankly I've only used bitwise operations at work once for implementing bloom filters. Normally I only get to do lower level stuff like that in side-projects). So internalizing two's complement has never seemed relevant.

        > And a byte value of 128? What is that in hex?

        0x80

        • jibal 17 hours ago

          > 0x80

          Which of course has the sign bit set.

          The comments here are educational ... I hadn't realized that the field of programming had become this degraded.

          • superblas 5 hours ago

            Such needless condescension, jibal.

  • goku12 a day ago

    Adding another resource I use frequently: https://cheats.rs/

    One part that I love especially about it is that it represents lifetimes [1] and memory layout [2] of data structures in graphical format. They're as invaluable as API references. I would love to see it included in other documentation as well.

    [1] https://cheats.rs/#memory-lifetimes

    [2] https://cheats.rs/#memory-layout

  • mattlutze 7 hours ago

    I love a page that doesn't react to my browser width.

  • adastra22 a day ago

    Why is PhantomData in the unsafe support group?

    • john-h-k a day ago

      It obviously can be used for other things but it principally was designed for unsafe support (allowing dropck to understand unsafe types that own a value through a pointer). See https://doc.rust-lang.org/nomicon/phantom-data.html
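
A minimal sketch of that unsafe-support use case, loosely following the pattern on the linked Nomicon page. `MyBox` is a hypothetical type, not something from the standard library.

```rust
use std::marker::PhantomData;

/// A hypothetical owning pointer built on a raw pointer.
struct MyBox<T> {
    ptr: *mut T,
    // Tells the compiler that MyBox logically owns a T, even though the
    // only field that refers to T is a raw pointer.
    _owns: PhantomData<T>,
}

impl<T> MyBox<T> {
    fn new(value: T) -> Self {
        MyBox {
            ptr: Box::into_raw(Box::new(value)),
            _owns: PhantomData,
        }
    }

    fn get(&self) -> &T {
        // SAFETY: `ptr` came from `Box::into_raw` and stays valid until drop.
        unsafe { &*self.ptr }
    }
}

impl<T> Drop for MyBox<T> {
    fn drop(&mut self) {
        // SAFETY: `ptr` was produced by `Box::into_raw` and is freed exactly once.
        unsafe { drop(Box::from_raw(self.ptr)) }
    }
}
```

The marker field is zero-sized, so it costs nothing at runtime; it only changes what the compiler assumes about ownership and variance.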

      • saghm a day ago

        Interesting, I've had to use it a number of times over the years despite never really doing much unsafe. At least to me, it seems pretty well-scoped as a workaround from the requirements that the compiler has around needing to use generic type parameters in type definitions, which certainly isn't something you need to be writing unsafe code to run into. I wouldn't be shocked if it used unsafe under the hood, but then again, so does Vec.
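
A sketch of the safe-code situation described here: a type parameter that only tags a value at compile time. `Id`, `User`, `Order`, and `user_id_raw` are hypothetical names chosen for the example.

```rust
use std::marker::PhantomData;

/// A typed ID: `T` only tags the ID at compile time, no `T` is ever stored.
/// Without the `PhantomData<T>` field, rustc rejects the struct with
/// error E0392 (parameter `T` is never used).
struct Id<T> {
    raw: u32,
    _marker: PhantomData<T>,
}

impl<T> Id<T> {
    fn new(raw: u32) -> Self {
        Id { raw, _marker: PhantomData }
    }

    fn raw(&self) -> u32 {
        self.raw
    }
}

struct User;
struct Order;

/// Accepts only user IDs; passing an `Id<Order>` is a compile-time type error.
fn user_id_raw(id: Id<User>) -> u32 {
    id.raw()
}
```

No `unsafe` appears anywhere in this sketch, and `Id<T>` stays the size of a plain `u32`.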

        • afdbcreid a day ago

          The original reason for designing it (instead of the previously inferred bivariance) was so that authors of unsafe code that really does not want bivariance, and would be unsound if it were used, remember to consider variance.

          It doesn't use unsafe under the hood, rather it's compiler magic.

        • john-h-k a day ago

          > At least to me, it seems pretty well-scoped as a workaround from the requirements that the compiler has around needing to use generic type parameters in type definitions

          The reason those requirements exist is (primarily) to do with unsafe code. Specifically it’s about deciding the variance of the type (which doesn’t matter for a truly unused type parameter).

  • stmw 21 hours ago

    It's very good, thanks for getting it some attention. Also - to show how much I agree - https://news.ycombinator.com/item?id=45140572

  • smj-edison a day ago

    I really like how it scrolls left-to-right on mobile, instead of collapsing down.

  • 6r17 a day ago

    There aren't that many of them, actually! It almost feels like a periodic table of elements.

  • shmerl 19 hours ago

    Really nice and concise presentation!

  • wiredpancake 16 hours ago

    [dead]