EDIT: I’m putting this up front so it’s the FIRST thing you see and read: I WAS WRONG. I ASSUMED (and I know better) that it wasn’t possible for me to have 3000 accounts created within a day or two of going live. I ASSUMED what I saw was accounts that were NOT local. I WAS WRONG. I created a process to remove the bot accounts from my database without crashing my site. I have tested it, and it looks like all functions are working. If you need help because you suddenly have thousands more accounts than you would expect, ask me for the procedure. I’ll gladly provide it.

I was able to identify bot accounts by looking at creation times. The accounts are grouped in “batches” where the account creation times are within seconds of each other. That’s not typically going to happen with random humans creating accounts.
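The batch check described above can be sketched roughly as follows. This is only an illustration, not the procedure actually used on the site: the function name `find_signup_batches` and the thresholds `max_gap` and `min_size` are assumptions chosen for the example, and the timestamps would have to come from wherever your instance records account creation times.

```python
from datetime import datetime, timedelta


def find_signup_batches(created_at, max_gap=timedelta(seconds=5), min_size=5):
    """Group creation times into runs where consecutive signups are close together.

    created_at: iterable of datetime objects (account creation times).
    max_gap:    assumed threshold; a larger gap between neighbours ends a run.
    min_size:   assumed threshold; shorter runs are treated as ordinary signups.
    """
    batches = []
    current = []
    for ts in sorted(created_at):
        # A gap larger than max_gap closes the current run.
        if current and ts - current[-1] > max_gap:
            if len(current) >= min_size:
                batches.append(current)
            current = []
        current.append(ts)
    # Do not forget the final run.
    if len(current) >= min_size:
        batches.append(current)
    return batches
```

Any run it returns is only a candidate for review. A burst of real signups (for example right after an announcement) could trip the same thresholds, so the accounts should still be checked by hand before anything is deleted.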

I used a tool to see how many users my site had. Once I saw the count was larger than expected, I wondered who these users were. I checked the database table and saw a huge list. I know for a fact that all these users are not on my instance. I was able to confirm that the database includes email address and password hash. This SHOULD mean that if someone tries to log in, and their authentication information is sitting in my database, they can log in at my site locally, correct? I only ask because I did not find an entry anywhere that lists a “home” instance for them to log in to. Am I correct in understanding that accounts are distributed like communities are?

  • Orvanis@lemm.ee
    1 year ago

    I was able to confirm that the database includes email address and password hash.

    Uhhhh not loving that if true… Why would password hashes need to be sent all over the planet…? That’s a security bomb just ticking.

    Shouldn’t each instance only need to be tracking user metadata, with only the original user’s instance handling authentication…? After all, my personal interaction is happening on my instance.

      • 6fn@kbin.social
        1 year ago

        Ok, I’ve looked at the source provided and don’t see an e-mail field either. The account e-mail is also limited to your own instance, correct? This thread was making me mildly concerned that e-mails were being shared when federating between instances.

      • Orvanis@lemm.ee
        1 year ago

        Thank you for linking the source! Seems OP was just mistaken about what they were seeing.

  • rknuu@beehaw.org
    1 year ago

    Yep, but it’s only the metadata[1]. I can’t log in to your instance, but because your instance has consumed content from beehaw from my account, I’m listed.

    See https://lemmy.ninja/u/rknuu@beehaw.org

    1. at least I haven’t been able to share logins between instances yet.
    • rknuu@beehaw.org
      1 year ago

      It’s also worth noting the statistics are only for what you consumed. My profile at beehaw shows very different numbers than yours

      https://beehaw.org/u/rknuu

      Lemmy.ninja: 2 posts, 26 comments

      Beehaw: 14 posts, 72 comments

  • key@lemmy.keychat.org
    1 year ago

    Are you looking at person or local_user? The former includes all (known) users across instances and doesn’t include a password hash. It has a private key column, but that should be empty for non-local users. The password hash is only on the local_user table, which, like it says, holds only local users. If you’re seeing more entries in local_user than you expect, that seems more concerning; maybe it’s related to the recently disclosed exploit?

    • MrEUser@lemmy.ninjaOP
      1 year ago

      SELECT * from local_user; provides a list of users that has a password_encrypted field. That list is exactly equal (all the same accounts are listed) to what I get from:

      select p.name, p.display_name, a.person_id, a.email, a.email_verified, a.accepted_application
      from local_user a, person p
      where a.person_id = p.id;

      So I can see a person’s email address (a.email), their a.person_id, and their password hash (password_encrypted) by correlating these tables, can I not?

      These accounts are NOT ALL local to my server… So I MUST be being passed hashes, right?

        • MrEUser@lemmy.ninjaOP
          1 year ago

          I grabbed the first 292 names in that query; there are thousands of results. Then I compared that list to the output of the query that includes “password_encrypted.” There are matches. LOTS of matches. I’ll give you ONE person_id of a result that is in both lists: 38291

          • key@lemmy.keychat.org
            1 year ago

            That person ID means nothing outside of your instance. The ID is a sequential number, so it’s saying “the 38,291st person I’ve seen” and each instance’s exact list of people will be different since they’ll see different people in different orders based on what users subscribe to and when.

            Here’s a query that will tell you the instance of that person:

            SELECT i.domain FROM person p INNER JOIN instance i ON p.instance_id=i.id WHERE p.id=38291;
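The same join can be run across every row of local_user at once, which shows directly whether any account holding a password hash belongs to another instance. Below is a minimal sketch against a toy in-memory SQLite database, not a real Lemmy database: the schema is cut down to the table and column names mentioned in this thread, and the sample rows, the domains `home.example` and `remote.example`, and the helper names `build_toy_db` and `local_accounts_with_domain` are made up for illustration.

```python
import sqlite3


def build_toy_db():
    """Create a tiny stand-in schema (an assumption, far smaller than the real one)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE instance (id INTEGER PRIMARY KEY, domain TEXT NOT NULL);
        CREATE TABLE person (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            instance_id INTEGER NOT NULL REFERENCES instance(id),
            local INTEGER NOT NULL
        );
        CREATE TABLE local_user (
            id INTEGER PRIMARY KEY,
            person_id INTEGER NOT NULL REFERENCES person(id),
            password_encrypted TEXT NOT NULL
        );
        -- Made-up sample data: two local accounts and one federated person.
        INSERT INTO instance VALUES (1, 'home.example'), (2, 'remote.example');
        INSERT INTO person VALUES (1, 'alice', 1, 1), (2, 'bob', 2, 0), (3, 'bot0001', 1, 1);
        INSERT INTO local_user VALUES (1, 1, 'hash-a'), (2, 3, 'hash-b');
        """
    )
    return conn


def local_accounts_with_domain(conn):
    """Return (name, domain, local) for every account that has a local_user row."""
    return conn.execute(
        """
        SELECT p.name, i.domain, p.local
        FROM local_user lu
        INNER JOIN person p ON lu.person_id = p.id
        INNER JOIN instance i ON p.instance_id = i.id
        ORDER BY p.id
        """
    ).fetchall()
```

In the toy data, the federated person `bob` has no local_user row and so never appears in the result. If every row returned on a real instance shows your own domain, the accounts with hashes are local signups rather than hashes received from elsewhere.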

            • MrEUser@lemmy.ninjaOP
              1 year ago

              I WAS WRONG. I ASSUMED (and I know better) that it wasn’t possible for me to have 3000 accounts created within a day or two of going live. I ASSUMED what I saw was accounts that were NOT local. I WAS WRONG. I created a process to remove the bot accounts from my database without crashing my site. I have tested it, and it looks like all functions are working. If you need help because you suddenly have thousands more accounts than you would expect, ask me for the procedure. I’ll gladly provide it.

              I was able to identify bot accounts by looking at creation times. The accounts are grouped in “batches” where the account creation times are within seconds of each other. That’s not typically going to happen with random humans creating accounts.

              I’m sorry, I included an edit on my original post. If I can do anything else to rectify the problem my assumption caused, let me know. Thank you for your help.

        • MrEUser@lemmy.ninjaOP
          1 year ago

          I always assume I’m wrong first, so I may have put that in the wrong spot. Where should I put that in the query? I put it under the SELECT statement.

  • Dead@keylog.zip
    1 year ago

    People from different instances are not able to log in on your instance. For example, you might see me in the person table, but local should be false. People registered on your instance show up in local_user and in person with local set to true.

    There have been mass registrations going on recently. Imagine my surprise when my 3-person instance had 460 users but no one online today. I took my instance down to investigate and deleted all local users in the person table except the known users. Not the cleanest way to do it, but I don’t think I broke anything :D

    • thatcasualgamingguy@lemmy.nerdcore.social
      1 year ago

      I think it’s actually pretty clean. Thanks to the references on the person table all related entries in the database like posts, comments, etc. also get deleted if there are any. It’s better than banning all those fake users and thereby spamming the modlog of all instances that federate with you 😅
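The cascading behaviour described above can be illustrated with a toy example. This is a sketch under stated assumptions, not Lemmy’s actual schema: it uses an in-memory SQLite database, a cut-down `person` table, and a hypothetical `comment` table whose `creator_id` column is declared with `ON DELETE CASCADE`. The helper names and sample rows are invented for the example.

```python
import sqlite3


def build_toy_db():
    """Toy schema (an assumption): comments reference their author with ON DELETE CASCADE."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE person (
            id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            local INTEGER NOT NULL
        );
        CREATE TABLE comment (
            id INTEGER PRIMARY KEY,
            creator_id INTEGER NOT NULL REFERENCES person(id) ON DELETE CASCADE,
            content TEXT NOT NULL
        );
        -- Made-up sample data: one real local user, one bot, one federated user.
        INSERT INTO person VALUES (1, 'alice', 1), (2, 'bot0001', 1), (3, 'bob', 0);
        INSERT INTO comment VALUES
            (1, 1, 'hello'), (2, 2, 'spam'), (3, 2, 'more spam'), (4, 3, 'hi from afar');
        """
    )
    return conn


def delete_local_except(conn, keep_names):
    """Delete local persons not in keep_names; return the comment texts that remain."""
    placeholders = ",".join("?" for _ in keep_names)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            f"DELETE FROM person WHERE local = 1 AND name NOT IN ({placeholders})",
            list(keep_names),
        )
    return conn.execute("SELECT content FROM comment ORDER BY id").fetchall()
```

Deleting the bot’s `person` row removes its comments with it, while rows belonging to the kept local user and to the federated user are untouched. On a real instance this kind of delete is irreversible, so a database backup beforehand is the sensible precaution.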

    • Kerb@discuss.tchncs.de
      1 year ago

      That reminds me of an incident we had.
      One of our customers’ sites suddenly had hundreds of new signups.

      It turns out a hacker tried to fuzz the site, and since there was no captcha/spam protection in the signup form, he created hundreds of accounts.

      • MrEUser@lemmy.ninjaOP
        1 year ago

        I created a process to remove the bot accounts from my database without crashing my site. I have tested it, and it looks like all functions are working. If you need help because you suddenly have thousands more accounts than you would expect, ask me for the procedure. I’ll gladly provide it.

        I was able to identify bot accounts by looking at creation times. The accounts are grouped in “batches” where the account creation times are within seconds of each other. That’s not typically going to happen with random humans creating accounts.

        • Kerb@discuss.tchncs.de
          1 year ago

          Don’t worry, it happened years ago.

          I wouldn’t be talking about that kind of stuff if it was ongoing or recent.

  • ActuallyRuben@actuallyruben.nl
    1 year ago

    How are you so certain that they’re not on your instance? I see that your sign up form is open. There’ve been other reports that spambots have discovered Lemmy and are signing up on instances en masse.

    • MrEUser@lemmy.ninjaOP
      1 year ago

      At this point, I’m not certain anymore. Luckily, all the accounts use values that make them easy to identify. I’ll figure out how to remove them. Sorry for the false alarm.

  • NotBadAndYou@lemmy.fmhy.ml
    1 year ago

    We’re entering a world where corporate media - be that music, television, movies, books, even political messaging - will soon be 99% algorithm driven. Train once, repeat ad infinitum. They already have an idea of how long it takes for public appetite to grow tired of the familiar and start craving new, and that will guide how often the algorithm should be re-trained. Meanwhile we, the consumers, will be kept fat and happy in our metaphorical pens being dished up a constant supply of delicious, processed food entertainment manipulation while they milk us for every dollar, minute, and vote that they can.