• 0 Posts
  • 22 Comments
Joined 5 years ago
Cake day: January 29th, 2021




  • Much of it might be freely available data, but there’s a huge difference between you accessing a website for data and an LLM scraper doing the same thing. We’ve had bots scraping websites since the ’90s; it’s not a new thing. And for as long as scraping bots have existed, we’ve had a standard on the web for dealing with them, called “robots.txt”: a text file telling bots what they are allowed to do on a website and how they should behave.

    LLM scrapers are notorious for ignoring it, leading to situations where small companies and organisations have their websites scraped so thoroughly and frequently that their operational costs skyrocket and they can’t even stay online anymore. In the last few years we’ve had to develop tools just to protect ourselves against this. See the “Anubis” project.

    Hence, it’s much more important that LLMs follow the rules than that you and I do so on an individual level.

    It’s the difference between you killing a couple of bees in your home versus an industry specialising in exterminating bees at scale. The efficiency is a big factor.
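
    To make the robots.txt point concrete, here is a minimal sketch of what a well-behaved bot does before fetching a page, using Python’s standard `urllib.robotparser`. The robots.txt content, the bot names (`ExampleLLMBot`, `SomeOtherBot`) and the URLs are made up for illustration; a real bot would fetch the file from the site’s `/robots.txt` first.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, for illustration only.
ROBOTS_TXT = """\
User-agent: ExampleLLMBot
Disallow: /

User-agent: *
Disallow: /private/
Crawl-delay: 10
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser we can query."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_fetch(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """A polite bot asks this before every request and skips the URL on False."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# The (made-up) LLM bot is banned from the whole site.
print(may_fetch(parser, "ExampleLLMBot", "https://example.org/blog/post"))
# Any other bot may read public pages, but not /private/.
print(may_fetch(parser, "SomeOtherBot", "https://example.org/blog/post"))
print(may_fetch(parser, "SomeOtherBot", "https://example.org/private/x"))
# Requested pause between requests, in seconds, for bots covered by "*".
print(parser.crawl_delay("SomeOtherBot"))
```

    Nothing enforces any of this. The file is only a request, which is exactly why a scraper that ignores it can do so much damage.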










  • Maybe because it’s not an obviously wanted feature? But I’m just guessing. You should request it and see what happens; maybe more people want it. I’ve never even thought about it, since in the case of Podman/Docker it’s so “obvious” and easy to just mount network shares on the host first. And in the case of Kubernetes you can just mount NFS shares directly into pods.





  • GunnarGrop@lemmy.ml to Selfhosted@lemmy.world: Mini pc arriving tomorrow
    1 year ago

    The Beelink is definitely powerful enough to run a hypervisor, so I would do that if I were you. Proxmox is a very good product and easy enough to use. Personally I use Harvester (with Rancher), but that might be a bit daunting if you’ve not used Kubernetes before.

    I would recommend running Proxmox as your OS, spinning up a few Debian virtual machines and running your services (Nextcloud, Plex/Jellyfin, …) in containers. I would personally use Podman, as I think it’s the simpler one to use, but there might be more documentation online for Docker, I’m not sure. But do definitely use containers! You’ll thank yourself in 6 months.

    For a reverse proxy I would suggest using Traefik, especially if you’re using Docker/Podman. There are other good solutions like Nginx Proxy Manager, which has the advantage of being very easy to use, but I run Traefik on every Podman server and Kubernetes cluster I have. That way I can just have a wildcard DNS entry pointing at one IP, and then every proxy route will just work, without having to touch the DNS further.
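
    Why the wildcard entry means you never touch DNS again is easiest to see in a toy model. This is not Traefik’s implementation or configuration, just a sketch of the idea in Python: one wildcard record sends every subdomain to the proxy’s IP, and the proxy picks a backend from the request’s Host header. The domain, IPs and ports are all made up.

```python
import fnmatch

# Hypothetical values, for illustration only.
WILDCARD_RECORD = "*.home.example.org"  # the single wildcard DNS entry
PROXY_IP = "192.0.2.10"                 # the reverse proxy (documentation-range IP)

# Routes the proxy knows about: Host header -> backend address.
ROUTES = {
    "nextcloud.home.example.org": "http://10.0.0.11:8080",
    "jellyfin.home.example.org": "http://10.0.0.12:8096",
}


def resolve(hostname: str):
    """Toy DNS lookup: every name under the wildcard resolves to the proxy."""
    if fnmatch.fnmatchcase(hostname.lower(), WILDCARD_RECORD):
        return PROXY_IP
    return None


def route(host_header: str):
    """Toy proxy routing: choose the backend by Host header, None if unknown."""
    return ROUTES.get(host_header.lower())


# Adding a service only means adding a route on the proxy...
ROUTES["wiki.home.example.org"] = "http://10.0.0.13:3000"

# ...because the new name already resolves to the proxy via the wildcard.
print(resolve("wiki.home.example.org"))
print(route("wiki.home.example.org"))
```

    With Traefik, the route is typically declared next to the service itself (for example as container labels), so deploying the container is the only step.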

    Also, just a general tip: look into how you can deploy everything using a GitOps flow, whether that’s just with Ansible or with more specialized solutions (Kubernetes with ArgoCD or FluxCD is very well suited for this). Look into Terraform/OpenTofu as well. This last point is by no means necessary, but if you ever (like me) get tired of forgetting how you set up your infrastructure (virtual machines, application deployments and configuration, etc.), you’ll love GitOps.

    Oh, but do definitely look into Ansible for configuring your servers. It will save you a lot of time in the long run.